{"id":125218,"date":"2021-07-21T01:23:00","date_gmt":"2021-07-21T08:23:00","guid":{"rendered":"https:\/\/lifeboat.com\/blog\/2021\/07\/nvidia-releases-tensorrt-8-for-faster-ai-inference"},"modified":"2021-07-21T01:23:00","modified_gmt":"2021-07-21T08:23:00","slug":"nvidia-releases-tensorrt-8-for-faster-ai-inference","status":"publish","type":"post","link":"https:\/\/lifeboat.com\/blog\/2021\/07\/nvidia-releases-tensorrt-8-for-faster-ai-inference","title":{"rendered":"Nvidia releases TensorRT 8 for faster AI inference"},"content":{"rendered":"<p><a class=\"aligncenter blog-photo\" href=\"https:\/\/lifeboat.com\/blog.images\/nvidia-releases-tensorrt-8-for-faster-ai-inference.jpg\"><\/a><\/p>\n<p>Nvidia today announced the release of <a href=\"https:\/\/venturebeat.com\/tag\/tensorrt\/\">TensorRT<\/a> 8, the latest version of its software development kit (SDK) designed for AI and machine learning inference. Nvidia claims that TensorRT 8, which is built for deploying AI models that can power search engines, ad recommendations, chatbots, and more, cuts inference time in half for language queries compared with the previous release of TensorRT.<\/p>\n<p>Models are growing increasingly complex, and demand is on the rise for real-time deep learning applications. According to a recent O\u2019Reilly <a href=\"https:\/\/www.oreilly.com\/radar\/ai-adoption-in-the-enterprise-2021\/\">survey<\/a>, 86.7% of organizations are now considering, evaluating, or putting AI products into production. And Deloitte <a href=\"https:\/\/www.zdnet.com\/article\/53-of-enterprises-spending-more-than-20-million-a-year-on-ai-technology-talent\/\">reports<\/a> that 53% of enterprises adopting AI spent more than $20 million in 2019 and 2020 on technology and talent.<\/p>\n<p>TensorRT essentially tunes a model\u2019s numerical parameters to balance the smallest model size against the highest accuracy for the system it\u2019ll run on. 
Nvidia claims that TensorRT-based apps perform up to 40 times faster than CPU-only platforms during inference, and that TensorRT 8-specific optimizations allow BERT-Large \u2014 one of the most popular <a href=\"https:\/\/venturebeat.com\/2021\/07\/15\/why-transformers-offer-more-than-meets-the-eye\/\">Transformer<\/a>-based models \u2014 to run in 1.2 milliseconds.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Nvidia today announced the release of TensorRT 8, the latest version of its software development kit (SDK) designed for AI and machine learning inference. Built for deploying AI models that can power search engines, ad recommendations, chatbots, and more, Nvidia claims that TensorRT 8 cuts inference time in half for language queries compared with the [\u2026]<\/p>\n","protected":false},"author":396,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2229,6],"tags":[],"class_list":["post-125218","post","type-post","status-publish","format-standard","hentry","category-mathematics","category-robotics-ai"],"_links":{"self":[{"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/posts\/125218","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/users\/396"}],"replies":[{"embeddable":true,"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/comments?post=125218"}],"version-history":[{"count":0,"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/posts\/125218\/revisions"}],"wp:attachment":[{"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/media?parent=125218"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/categories?post=125218"},{"taxonomy":"post_tag","embeddable":t
rue,"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/tags?post=125218"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}