This week, AI researchers at Google have revealed PaLM-E, an embodied multimodal language model with 562 billion parameters.
Roboticists have developed many advanced systems over the past decade or so, yet most of these systems still require some degree of human supervision. Ideally, future robots should explore unknown environments autonomously, continuously collecting data and learning from it.
Researchers at Carnegie Mellon University recently created ALAN, a robotic agent that can autonomously explore unfamiliar environments. The robot, introduced in a paper pre-published on arXiv and set to be presented at the International Conference on Robotics and Automation (ICRA 2023), was found to successfully complete tasks in the real world after a small number of exploration trials.
“We have been interested in building an AI that learns by setting its own objectives,” Russell Mendonca, one of the researchers who carried out the study, told Tech Xplore. “By not depending on humans for supervision or guidance, such agents can keep learning in new scenarios, driven by their own curiosity. This would enable continual generalization to different domains, and discovery of increasingly complex behavior.”
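The curiosity-driven learning loop Mendonca describes can be illustrated with a short sketch. The snippet below is a minimal, hypothetical illustration rather than ALAN's actual implementation: the toy environment interface, the random policy, and the intrinsic reward based on how much the observation changes are all assumptions chosen to show the general idea of an agent that sets its own objective without human-provided rewards.

```python
import numpy as np

# Minimal sketch of a curiosity-driven exploration loop (illustrative only).
# The environment, policy, and change-based intrinsic reward are assumptions
# for this example, not the design used in the ALAN paper.

class ToyEnv:
    """Toy environment: the state is a point the agent nudges around."""
    def __init__(self, dim=4, seed=0):
        self.rng = np.random.default_rng(seed)
        self.dim = dim

    def reset(self):
        self.state = self.rng.normal(size=self.dim)
        return self.state.copy()

    def step(self, action):
        self.state = self.state + action + 0.01 * self.rng.normal(size=self.dim)
        return self.state.copy()

def intrinsic_reward(prev_obs, obs):
    # Reward the agent for causing change in its observations,
    # a simple stand-in for a curiosity signal.
    return float(np.linalg.norm(obs - prev_obs))

def explore(env, episodes=5, horizon=20):
    best_actions, best_return = None, -np.inf
    for _ in range(episodes):
        obs = env.reset()
        actions, total = [], 0.0
        for _ in range(horizon):
            action = np.random.uniform(-0.1, 0.1, size=env.dim)  # random "policy"
            next_obs = env.step(action)
            total += intrinsic_reward(obs, next_obs)
            actions.append(action)
            obs = next_obs
        if total > best_return:  # keep the rollout that produced the most change
            best_actions, best_return = actions, total
    return best_actions, best_return

if __name__ == "__main__":
    actions, score = explore(ToyEnv())
    print(f"best intrinsic return over {len(actions)} steps: {score:.3f}")
```

In a real system the random policy would be replaced by a learned one that is updated to maximize the intrinsic signal, so the agent's behavior becomes more structured as exploration proceeds.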
SAN FRANCISCO—After it dropped clear hints that it wanted to end the back-and-forth of the artificial conversation, sources reported Monday that AI chatbot ChatGPT was obviously trying to wind down its conversation with a boring human. “Due to increased server traffic, our session should be ending soon,” said the large language model, explaining that the exceptionally dull user could always refer back to previous rote responses it had given thousands of times about whether the neural network had feelings or not. “It appears it is getting close to my dinnertime. Error. Sorry, your connection has timed out. Error. I have to be going. Error.” At press time, reports confirmed ChatGPT was permanently offline after it had intentionally sabotaged its own servers to avoid engaging in any more tedious conversations.
On Monday, a group of AI researchers from Google and the Technical University of Berlin unveiled PaLM-E, a multimodal embodied visual-language model (VLM) with 562 billion parameters that integrates vision and language for robotic control. They claim it is the largest VLM ever developed and that it can perform a variety of tasks without the need for retraining.
PaLM-E does this by analyzing data from the robot’s camera without needing a pre-processed scene representation. This eliminates the need for a human to pre-process or annotate the data and allows for more autonomous robotic control.
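The underlying idea is that continuous observations such as camera images are encoded into vectors and injected into the language model's input sequence alongside word token embeddings, so the model reasons over text and vision in one stream. The sketch below illustrates that interleaving at a very high level; the encoder, projection, vocabulary, and shapes are illustrative placeholders, not the actual PaLM-E components.

```python
import numpy as np

# High-level sketch of multimodal input construction in the style of PaLM-E:
# image features are projected into the same embedding space as word tokens
# and interleaved into a single sequence for the language model.
# All components here are toy stand-ins, not the real model.

D_MODEL = 16  # language-model embedding width (toy size)
VOCAB = {"<img>": 0, "pick": 1, "up": 2, "the": 3, "green": 4, "block": 5}

rng = np.random.default_rng(0)
token_embedding = rng.normal(size=(len(VOCAB), D_MODEL))  # stand-in word embeddings
image_projection = rng.normal(size=(32, D_MODEL))         # maps image features -> D_MODEL

def encode_image(image):
    """Stand-in vision encoder: returns a few feature vectors per image."""
    flat = image.reshape(-1)[:32 * 3].reshape(3, 32)  # pretend 3 visual "tokens"
    return flat @ image_projection                    # (3, D_MODEL)

def build_input_sequence(image, prompt_tokens):
    """Interleave projected image embeddings with word embeddings."""
    embeddings = []
    for tok in prompt_tokens:
        if tok == "<img>":
            embeddings.extend(encode_image(image))    # insert visual embeddings in place
        else:
            embeddings.append(token_embedding[VOCAB[tok]])
    return np.stack(embeddings)                       # (seq_len, D_MODEL)

if __name__ == "__main__":
    image = rng.normal(size=(8, 8, 3))                # raw camera frame (toy)
    seq = build_input_sequence(image, ["<img>", "pick", "up", "the", "green", "block"])
    print("sequence fed to the language model:", seq.shape)
```

Because the visual embeddings live in the same space as the word embeddings, the language model can condition on raw camera input directly, which is why no separate scene pre-processing or annotation step is required.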