* Performance: The model is optimized for “agentic coding” tasks. In benchmarks, it scored 59.5 on SWE-bench Pro, surpassing Google’s Gemini 3.1 Pro and slightly exceeding OpenAI’s GPT-5.5. It also performed strongly on other agent and reasoning tests.
* Inference and Release: Before its official launch, it operated anonymously on OpenRouter as “Owl Alpha,” becoming one of the platform’s top three most-used models. The model weights and technical infrastructure are expected to be released soon on platforms like Hugging Face. API pricing is set at $0.75 per million input tokens and $3 per million output tokens, with promotional rates available.
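
To make the quoted list prices concrete, here is a minimal cost-estimate sketch. It uses only the two rates stated above ($0.75 per million input tokens, $3 per million output tokens). The function name `estimate_cost` and the example token counts are hypothetical, and promotional rates are ignored.

```python
from decimal import Decimal

# List prices quoted in the article, in USD per million tokens.
INPUT_PRICE_PER_M = Decimal("0.75")
OUTPUT_PRICE_PER_M = Decimal("3.00")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the list-price cost in USD (hypothetical helper, no discounts)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        Decimal(input_tokens) * INPUT_PRICE_PER_M
        + Decimal(output_tokens) * OUTPUT_PRICE_PER_M
    )
    return total / TOKENS_PER_M


# Hypothetical agentic-coding session: 2M tokens read, 200k tokens generated.
cost = estimate_cost(2_000_000, 200_000)
print(f"${cost:.2f}")  # 2 * 0.75 + 0.2 * 3.00 = 1.50 + 0.60
```

Because agentic workloads tend to read far more tokens than they write, the input rate usually dominates the bill even though the output rate is four times higher per token.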
Meituan trained LongCat-2.0 on over 50,000 unnamed Chinese AI ASICs arranged in superpods with high-bandwidth interconnects. The chips share architectural similarities with Huawei’s Ascend 910C series, though Meituan has not publicly named the exact vendor.
The training run consumed more than 35 trillion tokens, including hundreds of billions of tokens at context lengths of roughly 1 million tokens. Training at this scale, previously achieved only on NVIDIA GPUs or Google TPUs, required extensive custom engineering in parallelism, fault tolerance, and numerical stability.
The team implemented 6D parallelism (tensor, context, expert, data, pipeline, and embedding parallelism) to efficiently distribute both the MoE layers and the novel embedding components across the cluster.
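
As a rough illustration of what a six-axis layout means in practice, the sketch below maps a flat device rank to a coordinate along each of the six named axes and back. It is not Meituan's implementation: the degrees in `AXES` are made up for a toy 96-device mesh, the helper names are hypothetical, and the axes are assumed to be independent with degrees that multiply to the device count. Real systems often overlap axes (for example, expert parallelism reusing data-parallel ranks), and the source does not say how LongCat-2.0 composes them.

```python
from math import prod

# Hypothetical parallelism degrees for a toy mesh (2*2*4*3*2*1 = 96 devices).
# The first axis listed varies fastest, so tensor-parallel peers are adjacent ranks.
AXES = {
    "tensor": 2,
    "context": 2,
    "expert": 4,
    "data": 3,
    "pipeline": 2,
    "embedding": 1,
}


def rank_to_coords(rank: int, axes: dict[str, int] = AXES) -> dict[str, int]:
    """Decompose a flat rank into one coordinate per parallelism axis."""
    world_size = prod(axes.values())
    if not 0 <= rank < world_size:
        raise ValueError(f"rank {rank} outside mesh of {world_size} devices")
    coords = {}
    for name, degree in axes.items():
        rank, coords[name] = divmod(rank, degree)
    return coords


def coords_to_rank(coords: dict[str, int], axes: dict[str, int] = AXES) -> int:
    """Inverse of rank_to_coords: recombine coordinates into a flat rank."""
    rank, stride = 0, 1
    for name, degree in axes.items():
        rank += coords[name] * stride
        stride *= degree
    return rank


def peers(rank: int, axis: str, axes: dict[str, int] = AXES) -> list[int]:
    """Ranks that differ from `rank` only along `axis` (one communication group)."""
    base = rank_to_coords(rank, axes)
    return [coords_to_rank({**base, axis: i}, axes) for i in range(axes[axis])]


print(rank_to_coords(5))
print(peers(5, "tensor"))
print(peers(5, "expert"))
```

Each `peers` list corresponds to one collective-communication group: devices in the same tensor group would exchange partial activations, while devices in the same expert group would route tokens among MoE experts. Placing the most communication-heavy axis on the fastest-varying dimension is a common design choice, since adjacent ranks typically share the highest-bandwidth links.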
