In the race to develop AI that understands complex images like financial forecasts, medical diagrams, and nutrition labels, a capability essential for AI to operate independently in everyday settings, closed-source systems like ChatGPT and Claude currently set the pace. But no one outside their makers knows how those models were trained or what data they used, leaving open-source alternatives scrambling to catch up.
Now, researchers at Penn Engineering and the Allen Institute for AI (Ai2) have developed a new approach to train open-source models: using AI to create scientific figures, charts and tables that teach other AI systems how to interpret complex visual information.
Their tool, CoSyn (short for Code-Guided Synthesis), taps open-source AI models’ coding skills to render text-rich images and generate relevant questions and answers, giving other AI systems the data they need to learn how to “see” and understand scientific figures.