{"id":160748,"date":"2023-03-21T22:22:26","date_gmt":"2023-03-22T03:22:26","guid":{"rendered":"https:\/\/lifeboat.com\/blog\/2023\/03\/nvidia-announces-h100-nvl-max-memory-server-card-for-large-language-models"},"modified":"2023-03-21T22:22:26","modified_gmt":"2023-03-22T03:22:26","slug":"nvidia-announces-h100-nvl-max-memory-server-card-for-large-language-models","status":"publish","type":"post","link":"https:\/\/lifeboat.com\/blog\/2023\/03\/nvidia-announces-h100-nvl-max-memory-server-card-for-large-language-models","title":{"rendered":"NVIDIA Announces H100 NVL \u2014 Max Memory Server Card for Large Language Models"},"content":{"rendered":"<p><a class=\"aligncenter blog-photo\" href=\"https:\/\/lifeboat.com\/blog.images\/nvidia-announces-h100-nvl-max-memory-server-card-for-large-language-models.jpg\"><\/a><\/p>\n<p>ChatGPT is currently deployed on A100 chips that have 80 GB of memory each. Nvidia decided this was a bit wimpy, so they developed much faster H100 chips (the H100 is roughly twice as fast as the A100) that have 94 GB of memory each, then found a way to put two of them on a single card with high-speed connections between them, for a total of 188 GB of memory per card.<\/p>\n<p>So hardware is getting more and more impressive!<\/p>\n<hr>\n<p>While this year\u2019s Spring GTC event doesn\u2019t feature any new GPUs or GPU architectures from NVIDIA, the company is still in the process of rolling out new products based on the Hopper and Ada Lovelace GPUs it introduced in the past year. 
At the high end of the market, the company today is announcing a new H100 accelerator variant specifically aimed at large language model users: the H100 NVL.<\/p>\n<p>The H100 NVL is an interesting variant on <a href=\"https:\/\/www.anandtech.com\/show\/17581\/nvidia-h100-hopper-accelerator-now-in-full-production-dgx-shipping-in-q1-23\">NVIDIA\u2019s H100 PCIe card<\/a> that, in a sign of the times and of NVIDIA\u2019s extensive success in the AI field, is aimed at a singular market: large language model (LLM) deployment. There are a few things that make this card atypical of NVIDIA\u2019s usual server fare \u2013 not the least of which is that it\u2019s two H100 PCIe boards that come already bridged together \u2013 but the big takeaway is the big memory capacity. The combined dual-GPU card offers 188GB of HBM3 memory \u2013 94GB per GPU \u2013 offering more memory per GPU than any other NVIDIA part to date, even within the H100 family.<\/p>\n<p>Driving this SKU is a specific niche: memory capacity. Large language models like the GPT family are in many respects memory capacity bound, as they\u2019ll quickly fill up even an H100 accelerator just to hold all of their parameters (175B in the case of the largest GPT-3 models). As a result, NVIDIA has opted to scrape together a new H100 SKU that offers a bit more memory per GPU than their usual H100 parts, which top out at 80GB per GPU.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>ChatGPT is currently deployed on A100 chips that have 80 GB of memory each. 
Nvidia decided this was a bit wimpy, so they developed much faster H100 chips (the H100 is roughly twice as fast as the A100) that have 94 GB of memory each, then found a way to put two of them on a [\u2026]<\/p>\n","protected":false},"author":7,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[6],"tags":[],"class_list":["post-160748","post","type-post","status-publish","format-standard","hentry","category-robotics-ai"],"_links":{"self":[{"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/posts\/160748","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/users\/7"}],"replies":[{"embeddable":true,"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/comments?post=160748"}],"version-history":[{"count":0,"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/posts\/160748\/revisions"}],"wp:attachment":[{"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/media?parent=160748"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/categories?post=160748"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/tags?post=160748"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}