Every year, the countries competing in the International Mathematical Olympiad arrive with a booklet of their best, most original problems. Those booklets get shared among delegations, then quietly disappear. No one had ever collected them systematically, cleaned them, and made them available—not for AI researchers testing the limits of mathematical reasoning, and not for the students around the world training for these competitions largely on their own.
Researchers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), King Abdullah University of Science and Technology (KAUST), and HUMAIN have now done exactly that.
MathNet is the largest high-quality dataset of proof-based math problems ever created, and it is openly available. Comprising more than 30,000 expert-authored problems and solutions spanning 47 countries, 17 languages, and 143 competitions, it is five times larger than the next-biggest dataset of its kind. The work will be presented at the International Conference on Learning Representations (ICLR 2026) in Brazil later this month.
