Mixture of Experts Powers the Most Intelligent Frontier AI Models, Runs 10x Faster on NVIDIA Blackwell NVL72

Lifeboat Foundation blog, December 3, 2025
https://lifeboat.com/blog/2025/12/mixture-of-experts-powers-the-most-intelligent-frontier-ai-models-runs-10x-faster-on-nvidia-blackwell-nvl72

Video: https://www.youtube.com/embed/TlmSpAvYwYI

"With GB200 NVL72 and Together AI's custom optimizations, we are exceeding customer expectations for large-scale inference workloads for MoE models like DeepSeek-V3," said Vipul Ved Prakash, cofounder and CEO of Together AI. "The performance gains come from NVIDIA's full-stack optimizations coupled with Together AI Inference breakthroughs across kernels, the runtime engine, and speculative decoding."

This performance advantage is evident across other frontier models. Kimi K2 Thinking, the most intelligent open-source model, serves as another proof point, achieving 10x better generational performance when deployed on GB200 NVL72.
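The article does not explain what makes Mixture-of-Experts models cheap to serve. The core idea is that a router selects only a few "expert" sub-networks per token, so compute per token stays small even when total parameters are huge. Below is a minimal, illustrative sketch of top-k routing in plain Python; the router weights, expert functions, and shapes are all hypothetical, and this is not NVIDIA's or Together AI's implementation.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def moe_forward(x, router_weights, experts, k=2):
    """Route input x to the top-k experts and mix their outputs.

    router_weights: one score vector per expert (dotted with x).
    experts: list of callables, each mapping x -> a scalar output.
    Only k experts run per input; the rest are skipped entirely,
    which is where the inference savings of MoE come from.
    """
    scores = [sum(w * xi for w, xi in zip(wv, x)) for wv in router_weights]
    probs = softmax(scores)
    # Pick the k highest-probability experts.
    topk = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:k]
    # Renormalize gate weights over the selected experts only.
    norm = sum(probs[i] for i in topk)
    return sum(probs[i] / norm * experts[i](x) for i in topk)

# Toy example: four constant "experts"; the router strongly prefers
# experts 0 and 3 for this input, so only those two contribute.
experts = [lambda x: 1.0, lambda x: 2.0, lambda x: 3.0, lambda x: 4.0]
router_weights = [[10, 0], [0, 0], [0, 0], [0, 10]]
out = moe_forward([1.0, 1.0], router_weights, experts, k=2)
```

Production systems add batching, load-balancing losses, and fused GPU kernels on top of this basic pattern, but the routing logic is the same.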