{"id":208098,"date":"2025-03-07T17:15:41","date_gmt":"2025-03-07T23:15:41","guid":{"rendered":"https:\/\/lifeboat.com\/blog\/2025\/03\/pokechamp-an-expert-level-minimax-language-agent"},"modified":"2025-03-07T17:15:41","modified_gmt":"2025-03-07T23:15:41","slug":"pokechamp-an-expert-level-minimax-language-agent","status":"publish","type":"post","link":"https:\/\/lifeboat.com\/blog\/2025\/03\/pokechamp-an-expert-level-minimax-language-agent","title":{"rendered":"Pok\u00e9Champ: an Expert-level Minimax Language Agent"},"content":{"rendered":"<p><a class=\"aligncenter blog-photo\" href=\"https:\/\/lifeboat.com\/blog.images\/logo.pokechamp-an-expert-level-minimax-language-agent2.jpg\"><\/a><\/p>\n<p>We introduce Pok\u00e9Champ, a minimax agent powered by Large Language Models (LLMs) for Pok\u00e9mon battles. Built on a general framework for two-player competitive games, Pok\u00e9Champ leverages the generalist capabilities of LLMs to enhance minimax tree search. Specifically, LLMs replace three key modules: player action sampling, opponent modeling, and value function estimation, enabling the agent to effectively utilize gameplay history and human knowledge to reduce the search space and address partial observability. Notably, our framework requires no additional LLM training. We evaluate Pok\u00e9Champ in the popular Gen 9 OU format. When powered by GPT-4o, it achieves a win rate of 76% against the best existing LLM-based bot and 84% against the strongest rule-based bot, demonstrating its superior performance. Even with an open-source 8-billion-parameter Llama 3.1 model, Pok\u00e9Champ consistently outperforms the previous best LLM-based bot, Pok\u00e9llmon powered by GPT-4o, with a 64% win rate. Pok\u00e9Champ attains a projected Elo of 1300\u20131500 on the Pok\u00e9mon Showdown online ladder, placing it among the top 30%-10% of human players. 
In addition, this work compiles the largest real-player Pok\u00e9mon battle dataset, featuring over 3 million games, including more than 500k high-Elo matches. Based on this dataset, we establish a series of battle benchmarks and puzzles to evaluate specific battling skills. We further provide key updates to the local game engine. We hope this work fosters further research that leverages Pok\u00e9mon battles as a benchmark to integrate LLM technologies with game-theoretic algorithms addressing general multiagent problems. Videos, code, and dataset are available at this https URL.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>We introduce Pok\u00e9Champ, a minimax agent powered by Large Language Models (LLMs) for Pok\u00e9mon battles. Built on a general framework for two-player competitive games, Pok\u00e9Champ leverages the generalist capabilities of LLMs to enhance minimax tree search. Specifically, LLMs replace three key modules: player action sampling, opponent modeling, and value function estimation, enabling the agent to 
[\u2026]<\/p>\n","protected":false},"author":709,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1509,41,6,8],"tags":[],"class_list":["post-208098","post","type-post","status-publish","format-standard","hentry","category-entertainment","category-information-science","category-robotics-ai","category-space"],"_links":{"self":[{"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/posts\/208098","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/users\/709"}],"replies":[{"embeddable":true,"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/comments?post=208098"}],"version-history":[{"count":0,"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/posts\/208098\/revisions"}],"wp:attachment":[{"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/media?parent=208098"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/categories?post=208098"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/tags?post=208098"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}