Today, leading AI technologies such as large language models (LLMs) have begun to transform how we access and work with abstract knowledge. Yet they remain wordsmiths in the dark, eloquent but inexperienced, knowledgeable but ungrounded.
For humans, spatial intelligence is the scaffolding upon which our cognition is built. It’s at work when we passively observe or actively seek to create. It drives our reasoning and planning, even on the most abstract topics. And it’s essential to the way we interact—verbally or physically, with our peers or with the environment itself. When machines are endowed with this ability, it will transform how we create and interact with real and virtual worlds—revolutionizing storytelling, robotics, scientific discovery, and beyond. This is AI’s next frontier, and why 2025 was such a pivotal year.
The candid truth is that AI’s spatial capabilities remain far from the human level. But tremendous progress has indeed been made. Multimodal LLMs, trained on large volumes of multimedia data in addition to text, have introduced some basics of spatial awareness, and today’s AI can analyze pictures, answer questions about them, and generate hyperrealistic images and short videos.
