Apr 3, 2024
Apple researchers develop AI that can ‘see’ and understand screen context
Posted by Kelvin Dafiaghor in category: robotics/AI
Apple researchers have developed a new artificial intelligence system that can understand ambiguous references to on-screen entities as well as conversational and background context, enabling more natural interactions with voice assistants, according to a paper published on Friday.
The system, called ReALM (Reference Resolution As Language Modeling), leverages large language models to convert the complex task of reference resolution — including understanding references to visual elements on a screen — into a pure language modeling problem. This allows ReALM to achieve substantial performance gains compared to existing methods.