{"id":218220,"date":"2025-07-18T09:07:30","date_gmt":"2025-07-18T14:07:30","guid":{"rendered":"https:\/\/lifeboat.com\/blog\/2025\/07\/mits-new-ai-can-teach-itself-to-control-robots-by-watching-the-world-through-their-eyes-it-only-needs-a-single-camera"},"modified":"2025-07-18T09:07:30","modified_gmt":"2025-07-18T14:07:30","slug":"mits-new-ai-can-teach-itself-to-control-robots-by-watching-the-world-through-their-eyes-it-only-needs-a-single-camera","status":"publish","type":"post","link":"https:\/\/lifeboat.com\/blog\/2025\/07\/mits-new-ai-can-teach-itself-to-control-robots-by-watching-the-world-through-their-eyes-it-only-needs-a-single-camera","title":{"rendered":"MIT\u2019s new AI can teach itself to control robots by watching the world through their eyes \u2014 it only needs a single camera"},"content":{"rendered":"<p><a class=\"aligncenter blog-photo\" href=\"https:\/\/lifeboat.com\/blog.images\/mits-new-ai-can-teach-itself-to-control-robots-by-watching-the-world-through-their-eyes-it-only-needs-a-single-camera.jpg\"><\/a><\/p>\n<p>The framework consists of two key components. The first is a deep-learning model that lets the robot determine where it and its appendages are in three-dimensional space, which in turn lets it predict how its position will change as specific movement commands are executed. The second is a machine-learning program that translates generic movement commands into code the robot can understand and execute.<\/p>\n<p>The team tested the new training and control paradigm by benchmarking it against traditional camera-based control methods. The Jacobian field solution surpassed existing 2D control systems in accuracy, especially when the team introduced visual occlusion that caused the older methods to fail outright. 
Machines using the team\u2019s method, however, successfully created navigable 3D maps even when scenes were partially occluded with random clutter.<\/p>\n<p>Once the scientists had developed the framework, they applied it to robots with widely varying architectures. The result was a control program that can train and operate robots using only a single video camera, with no further human intervention.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The framework consists of two key components. The first is a deep-learning model that lets the robot determine where it and its appendages are in three-dimensional space, which in turn lets it predict how its position will change as specific movement commands are executed. The second is a machine-learning program that translates [\u2026]<\/p>\n","protected":false},"author":662,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1965,6],"tags":[],"class_list":["post-218220","post","type-post","status-publish","format-standard","hentry","category-mapping","category-robotics-ai"],"_links":{"self":[{"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/posts\/218220","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/users\/662"}],"replies":[{"embeddable":true,"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/comments?post=218220"}],"version-history":[{"count":0,"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/posts\/218220\/revisions"}],"wp:attachment":[{"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/media?parent=218220"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/categories?post=218220"},{
"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/tags?post=218220"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}