Toggle light / dark theme

Scaling up learning across many different robot types

Robots are great specialists, but poor generalists. Typically, you have to train a model for each task, robot, and environment. Changing a single variable often requires starting from scratch. But what if we could combine the knowledge across robotics and create a way to train a general-purpose robot?

Today, we are launching a new set of resources for general-purpose robotics learning across different robot types, or embodiments. Together with partners from 33 academic labs we have pooled data from 22 different robot types to create the Open X-Embodiment dataset. We also release RT-1-X, a robotics transformer (RT) model derived from RT-1 and trained on our dataset, that shows skills transfer across many robot embodiments.

In this work, we show training a single model on data from multiple embodiments leads to significantly better performance across many robots than those trained on data from individual embodiments. We tested our RT-1-X model in five different research labs, demonstrating 50% success rate improvement on average across five different commonly used robots compared to methods developed independently and specifically for each robot. We also showed that training our visual language action model, RT-2, on data from multiple embodiments tripled its performance on real-world robotic skills.

Why Big Tech’s bet on AI assistants is so risky

This is a risky bet, given the limitations of the technology. Tech companies have not solved some of the persistent problems with AI language models, such as their propensity to make things up or “hallucinate.” But what concerns me the most is that they are a security and privacy disaster, as I wrote earlier this year. Tech companies are putting this deeply flawed tech in the hands of millions of people and allowing AI models access to sensitive information such as their emails, calendars, and private messages. In doing so, they are making us all vulnerable to scams, phishing, and hacks on a massive scale.

I’ve covered the significant security problems with AI language models before. Now that AI assistants have access to personal information and can simultaneously browse the web, they are particularly prone to a type of attack called indirect prompt injection. It’s ridiculously easy to execute, and there is no known fix.

In an indirect prompt injection attack, a third party “alters a website by adding hidden text that is meant to change the AI’s behavior,” as I wrote in April. “Attackers could use social media or email to direct users to websites with these secret prompts. Once that happens, the AI system could be manipulated to let the attacker try to extract people’s credit card information, for example.” With this new generation of AI models plugged into social media and emails, the opportunities for hackers are endless.

AI is coming to the Arc browser — but probably not like you think

Sure, you could just stick a ChatGPT sidebar in your browser. But what do we really want AI to do for us as we use the web? That’s the much harder question.

At some point, if you’re a company doing pretty much anything in the year 2023, you have to have an AI strategy. It’s just business. You can make a ChatGPT plug-in. You can do a sidebar. You can bet your entire trillion-dollar company on AI being the future of how everyone does everything. But you have to do something.

The last one of these was crypto and the blockchain a couple of years ago, and Josh Miller, the CEO of The Browser Company, which makes the popular new Arc browser, says he’s… More.


AI is coming for your online life… but nobody’s exactly sure how it’s going to work.

Zoom Docs launches in 2024 with built-in AI collaboration features

Zoom’s selling a cheaper AI package than Microsoft 365 Copilot and Google Duet AI, and soon it can plug into a new ‘modular workspace.’

At Zoomtopia 2023 today, Zoom announced Zoom Docs, a collaboration-focused “modular workspace” that integrates the company’s Zoom AI Companion for generating new content or populating a doc from other sources — you know the drill by now.

Along with the Mail and Calendar offerings launched during last year’s event, Zoom Docs is another step toward a full office suite alternative to Google Workspace and Microsoft 365, which both have started to integrate AI-powered tools of their own, dubbed Duet AI and Copilot, respectively. The company says it will be widely… More.


Zoom’s new tool expands beyond the Zoom meeting.

Oracle Doubles Down On Generative AI Trend At CloudWorld 2023

Last week at its annual CloudWorld event in Las Vegas, Oracle showed that it, too, is going full throttle on generative AI–and that it has no plans to cower to its biggest rival Amazon Web Services (AWS.)Before we get into the CloudWorld event itself, it’s important to take a tiny step back to September 14 when the company announced a new partnership with Microsoft that puts Oracle database services on Oracle Cloud Infrastructure (OCI) in Microsoft Azure. The new Oracle Database@Azure makes Microsoft and Oracle the only two hyperscalers to offer OCI to help simplify cloud migration, deployment and management. Especially when you consider that the partners have achieved rate card and… More.


This year at Oracle CloudWorld, the company advanced its generative AI strategy across its cloud infrastructure, apps, and platforms. Exploring this year’s announcements.

New AI model can tell if you need lung cancer screening

Lung cancer screening is crucial for decreasing the death count from the disease but the government can’t scan everyone’s lungs. Here is an AI that identifies people who actually need screening.

Lung cancer is the deadliest cancer type, killing over a million people annually across the globe. The disease is responsible for the highest number of cancer deaths in both men and women in the US.

In fact, the death toll from lung cancer among women and men is nearly triple that of breast cancer and prostate cancer, respectively.

Spotify users may be able to generate AI playlists using prompts

A product designer spotted prompts in Spotify’s codes.

Users may soon be able to create artificial intelligence-generated Spotify playlists using prompts. Speculations are rife ever since hashtag creator and product designer Chris Messina posted pictures of code from Spotify’s backend on Threads.

It would be something like OpenAI’s chatbot ChatGPT, but for creating a song playlist.


Spotify appears to be developing AI-powered playlists. References discovered in the app’s code indicate the company may be developing generative AI playlists users could create using prompts.

Samsung to develop AI chips with Canadian startup Tenstorrent

The race to develop AI chips continues as Samsung’s chip manufacturing department partnered with Canadian startup Tenstorrent to produce chips and intellectual property for data centers.

The Canadian startup Tenstorrent, which builds artificial intelligence (AI) processors, among other things, revealed a new partnership with Samsung’s chip manufacturing department.

On Oct. 2, the startup announced the partnership with Samsung, saying it will use it to bring the “next generation of AI chiplets to market.” Tenstorrent manufactures chips and intellectual property (IP) for data centers.


The development comes as dominance in the AI chip market is currently held by American tech manufacturer Nvidia.

ChatGPT forces us to ask: how much of “being human” belongs to us?

ChatGPT is a hot topic at my university, where faculty members are deeply concerned about academic integrity, while administrators urge us to “embrace the benefits” of this “new frontier.” It’s a classic example of what my colleague Punya Mishra calls the “doom-hype cycle” around new technologies. Likewise, media coverage of human-AI interaction – whether paranoid or starry-eyed – tends to emphasize its newness.

In one sense, it is undeniably new. Interactions with ChatGPT can feel unprecedented, as when a tech journalist couldn’t get a chatbot to stop declaring its love for him. In my view, however, the boundary between humans and machines, in terms of the way we interact with one another, is fuzzier than most people would care to admit, and this fuzziness accounts for a good deal of the discourse swirling around ChatGPT.

When I’m asked to check a box to confirm I’m not a robot, I don’t give it a second thought – of course I’m not a robot. On the other hand, when my email client suggests a word or phrase to complete my sentence, or when my phone guesses the next word I’m about to text, I start to doubt myself. Is that what I meant to say? Would it have occurred to me if the application hadn’t suggested it? Am I part robot? These large language models have been trained on massive amounts of “natural” human language. Does this make the robots part human?

/* */