ChatGPT’s New Upgrade Teases AI’s Multimodal Future

ChatGPT isn’t just a chatbot anymore.

OpenAI’s latest upgrade grants ChatGPT powerful new abilities that go beyond text. It can tell bedtime stories in its own AI voice, identify objects in photos, and respond to audio recordings. These capabilities represent the next big thing in AI: multimodal models.

“Multimodal is the next generation of these large models, where it can process not just text, but also images, audio, video, and even other modalities,” says Dr. Linxi “Jim” Fan, Senior AI Research Scientist at Nvidia.

OpenAI’s chatbot learns to carry a conversation, and competition is expected.
