AI-powered headphones offer group translation with voice cloning and 3D spatial audio

Tuochao Chen, a University of Washington doctoral student, recently toured a museum in Mexico. Chen doesn’t speak Spanish, so he ran a translation app on his phone and pointed the microphone at the tour guide. But even in a museum’s relative quiet, the surrounding noise was too much. The resulting text was useless.

Various technologies have emerged lately promising fluent translation, but none of these solved Chen’s problem. Meta’s new glasses, for instance, function only with an isolated speaker; they play an automated voice translation after the speaker finishes.

Now, Chen and a team of UW researchers have designed a headphone system that translates several speakers at once, while preserving the direction and qualities of people’s voices. The team built the system, called Spatial Speech Translation, with off-the-shelf noise-canceling headphones fitted with microphones. The team’s algorithms separate out the different speakers in a space and follow them as they move, translate their speech and play it back with a 2–4 second delay.
