Apple researchers have developed a new artificial intelligence system that can understand ambiguous references to on-screen entities as well as conversational and background context, enabling more natural interactions with voice assistants, according to a paper published on Friday.
The system, called ReALM (Reference Resolution As Language Modeling), leverages large language models to convert the complex task of reference resolution — including understanding references to visual elements on a screen — into a pure language modeling problem. This allows ReALM to achieve substantial performance gains compared to existing methods.
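To give a rough sense of what "reference resolution as language modeling" means in practice, the sketch below shows how on-screen items might be serialized into a plain-text prompt that a language model can answer. This is a minimal illustration of the framing described in the paper, not Apple's actual implementation; the entity fields, prompt wording, and helper names (OnScreenEntity, build_prompt) are assumptions made for the example.

```python
# Hypothetical sketch: cast reference resolution as a text-in, text-out task.
# On-screen entities are flattened into numbered lines of text, and a language
# model is asked which entity the user's utterance refers to. All names and
# fields here are illustrative assumptions, not the ReALM API.
from dataclasses import dataclass

@dataclass
class OnScreenEntity:
    entity_id: int
    entity_type: str   # e.g. "phone_number", "address", "business_name"
    text: str          # the text as it appears on screen

def build_prompt(entities: list[OnScreenEntity], utterance: str) -> str:
    """Serialize the screen contents and the user's request into one prompt."""
    lines = ["On-screen entities:"]
    for e in entities:
        lines.append(f"  [{e.entity_id}] ({e.entity_type}) {e.text}")
    lines.append(f'User request: "{utterance}"')
    lines.append("Which entity id does the request refer to?")
    return "\n".join(lines)

if __name__ == "__main__":
    screen = [
        OnScreenEntity(1, "business_name", "Joe's Pizza"),
        OnScreenEntity(2, "phone_number", "(555) 010-2345"),
        OnScreenEntity(3, "address", "123 Main St"),
    ]
    # The resulting prompt would be fed to a fine-tuned language model,
    # which outputs the id of the entity the utterance refers to.
    print(build_prompt(screen, "call the bottom one"))
```

Framing the problem this way lets a single model handle conversational references ("the one I mentioned earlier") and on-screen references ("the number at the bottom") with the same text-based machinery, which is what allows the comparison against existing reference-resolution systems.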