Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models

In this work, we introduce Mini-Gemini, a simple and effective framework that enhances multi-modality Vision Language Models (VLMs).

