Toggle light / dark theme

BLINK: Multimodal Large Language Models Can See but Not Perceive

Posted in futurism

From upenn BLINK multimodal large language models can see but not perceive.

From UPenn.

Multimodal large language models can see but not perceive.

We introduce Blink, a new benchmark for multimodal language models (LLMs) that focuses on core visual perception abilities not found in other evaluations.


Join the discussion on this paper page.