
MIT Machine-Learning System IDs Objects In Photos

Computer scientists at MIT have developed a machine-learning system that can identify objects in an image based on a spoken description of the image.

Typical speech recognition systems like Google Voice and Siri rely on transcriptions of thousands of hours of speech recordings, which are then used to map speech signals to specific words.

Still in its early stages, the MIT system learns words from recorded speech clips and objects from images, and then links the two. So far it can recognize several hundred different words and objects, with the expectation that future versions will scale to larger vocabularies.


