Jul 21, 2024

Cognition: Devin was evaluated on a random 25% subset of the dataset

Posted in category: robotics/AI

Devin was unassisted, whereas all other models were assisted (meaning the model was told exactly which files need to be edited).

We plan to publish a more detailed technical report soon.

We are an applied AI lab focused on reasoning. We’re building AI teammates with capabilities far beyond today’s existing AI tools. By solving reasoning, we can unlock new possibilities in a wide range of disciplines; code is just the beginning. We want to help people around the world turn their ideas into reality.
