
Humans are usually good at isolating a single voice in a crowd, but computers? Not so much — just ask anyone trying to talk to a smart speaker at a house party. Google may have a surprisingly straightforward solution, however. Its researchers have developed a deep learning system that can pick out specific voices by looking at people’s faces when they’re speaking. The team trained its neural network model to recognize individual people speaking by themselves, and then created virtual “parties” (complete with background noise) to teach the AI how to isolate multiple voices into distinct audio tracks.

The results are uncanny. Even when people are clearly trying to compete with each other (such as comedians Jon Dore and Rory Scovel in a Team Coco clip), the AI can generate a clean audio track for one person just by focusing on their face. That’s true even if the person partially obscures their face with hand gestures or a microphone.

Google is currently “exploring opportunities” to use this feature in its products, but there are more than a few prime candidates. It’s potentially ideal for video chat services like Hangouts or Duo, where it could help you understand someone talking in a crowded room. It could also be helpful for speech enhancement in video recording. And there are big implications for accessibility: it could lead to camera-linked hearing aids that boost the sound of whoever’s in front of you, and more effective closed captioning. There are potential privacy issues (this could be used for public eavesdropping), but it wouldn’t be too difficult to limit the voice separation to people who’ve clearly given their consent.


You might only know JPEG as the default image compression standard, but the group behind it has now branched out into something new: JPEG XS. JPEG XS is described as a new low-energy format designed to stream live video and VR, even over WiFi and 5G networks. It’s not a replacement for JPEG and the file sizes themselves won’t be smaller; it’s just that this new format is optimized specifically for lower latency and energy efficiency. In other words, JPEG is for downloading, but JPEG XS is more for streaming.

The new standard was introduced this week by the Joint Photographic Experts Group, which says that the aim of JPEG XS is to “stream the files instead of storing them in smartphones or other devices with limited memory.” So in addition to getting faster HD content on your large displays, the group also sees JPEG XS as a valuable format for faster stereoscopic VR streaming plus videos streamed by drones and self-driving cars.

“We are compressing less in order to better preserve quality, and we are making the process faster while using less energy,” says JPEG leader Touradj Ebrahimi in a statement. According to Ebrahimi, the JPEG XS video compression will be less severe than with JPEG photos — while JPEG photos are compressed by a factor of 10, JPEG XS is compressed by a factor of 6. The group promises a “visual lossless” quality to the images of JPEG XS.


Like a miniaturized Moby Dick, the pure-white fish wiggles slowly over the reef, ducking under corals and ascending, then descending again, up and down and all around. Its insides, though, are not flesh, but electronics. And its flexible tail flicking back and forth is not made of muscle and scales, but elastomer.

The Soft Robotic Fish, aka SoFi, is a hypnotic machine, the likes of which the sea has never seen before. In a paper published today in Science Robotics, MIT researchers detail the evolution of the world’s strangest fish, and describe how it could be a potentially powerful tool for scientists to study ocean life.

Scientists designed SoFi to solve several problems that bedevil oceanic robotics. Problem one: communication. Underwater vehicles are typically tethered to a boat because radio waves don’t do well in water. What SoFi’s inventors have opted for instead is sound.


A Dutch-Texan team found that most Houston-area drowning deaths from Hurricane Harvey occurred outside the zones designated by the government as being at higher risk of flooding: the 100- and 500-year floodplains. Harvey, one of the costliest storms in US history, hit southeast Texas on 25 August 2017, causing unprecedented flooding and killing dozens. Researchers at Delft University of Technology in the Netherlands and Rice University in Texas published their results today in the European Geosciences Union journal Natural Hazards and Earth System Sciences.

“It was surprising to me that so many fatalities occurred outside the flood zones,” says Sebastiaan Jonkman, a professor at Delft’s Hydraulic Engineering Department who led the new study.

Drowning caused 80% of Harvey deaths, and the research showed that only 22% of fatalities in Houston’s 4,600-square-kilometre district, Harris County, occurred within the 100-year floodplain, a mapped area that is used as the main indicator of flood risk in the US.
