Imagine trying to make an accurate three-dimensional model of a building using only pictures taken from different angles—but you’re not sure where or how far away all the cameras were. Our big human brains can fill in a lot of those details, but computers have a much harder time doing so.
This scenario is a well-known problem in computer vision and robot navigation systems. Robots, for instance, must take in lots of 2D information and build 3D point clouds (collections of data points in 3D space) in order to interpret a scene. But the mathematics involved in this process is challenging and error-prone, with many ways for the computer to incorrectly estimate distances. It’s also slow, because it forces the computer to create its 3D point cloud bit by bit.
Computer scientists at the Harvard John A. Paulson School of Engineering and Applied Sciences (SEAS) think they have a better method: a breakthrough algorithm that lets computers reconstruct high-quality 3D scenes from 2D images much more quickly than existing methods.