When the Chinese AI startup DeepSeek unveiled its open-source large language model DeepSeek-R1 in January, many called it an “AI Sputnik shock,” a reference to the Soviet Union’s 1957 launch of the first artificial satellite and the jolt it delivered to the West.
Much about DeepSeek’s LLM remains uncertain, and its capabilities should not be overestimated. Nevertheless, its release has sparked intense discussion, particularly about the model’s claimed cost advantage. DeepSeek claims that its model possesses reasoning abilities on par with, or even superior to, OpenAI’s leading models, achieved at less than one-tenth of OpenAI’s training costs (reportedly just $5.6 million), largely because it used NVIDIA’s lower-cost H800 GPUs rather than the more powerful H100 or H200 chips.
Tech giants like Meta and Google have spent billions of dollars on high-performance GPUs to develop cutting-edge AI models. However, DeepSeek’s apparent ability to produce a high-performance model at a significantly lower cost challenges the prevailing belief that computational power, determined by the number and quality of GPUs, is the primary driver of AI performance.