{"id":222010,"date":"2025-09-17T00:24:34","date_gmt":"2025-09-17T05:24:34","guid":{"rendered":"https:\/\/lifeboat.com\/blog\/2025\/09\/doing-the-math-on-cpu-native-ai-inference"},"modified":"2025-09-17T00:24:34","modified_gmt":"2025-09-17T05:24:34","slug":"doing-the-math-on-cpu-native-ai-inference","status":"publish","type":"post","link":"https:\/\/lifeboat.com\/blog\/2025\/09\/doing-the-math-on-cpu-native-ai-inference","title":{"rendered":"Doing The Math On CPU-Native AI Inference"},"content":{"rendered":"<p><iframe style=\"display: block; margin: 0 auto; width: 100%; aspect-ratio: 4\/3; object-fit: contain;\" src=\"https:\/\/www.youtube.com\/embed\/3jU_YhZ1NQA?feature=oembed\" frameborder=\"0\" allow=\"accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture\" allowfullscreen><\/iframe><\/p>\n<p>A number of chip companies \u2014 importantly Intel and IBM, but also the Arm collective and AMD \u2014 have come out recently with new CPU designs that feature native Artificial Intelligence (AI) and machine learning (ML) acceleration. The need for math engines specifically designed to support machine learning algorithms, particularly for inference workloads but also for certain kinds of training, has been covered extensively here at <em>The Next Platform<\/em>.<\/p>\n<p>Just to rattle off a few of them, consider <a href=\"https:\/\/www.nextplatform.com\/2020\/08\/18\/ibm-brings-an-architecture-gun-to-a-chip-knife-fight\/\">the impending \u201cCirrus\u201d Power10 processor from IBM<\/a>, which is due in a matter of days from Big Blue in its high-end NUMA machines and which has a new matrix math engine aimed at accelerating machine learning. 
Or <a href=\"https:\/\/www.nextplatform.com\/2021\/08\/23\/ibm-bets-big-on-native-inference-with-big-iron\/\">IBM\u2019s \u201cTelum\u201d z16 mainframe processor<\/a> coming next year, which was unveiled at the recent Hot Chips conference and which has a dedicated mixed precision matrix math core for the CPU cores to share. Intel is adding its <a href=\"https:\/\/www.nextplatform.com\/2021\/08\/19\/with-amx-intel-adds-ai-ml-sparkle-to-sapphire-rapids\/\">Advanced Matrix Extensions (AMX) to its future \u201cSapphire Rapids\u201d Xeon SP processors<\/a>, which should have been here by now but which have been pushed out to early next year. Arm Holdings has created future Arm core designs, <a href=\"https:\/\/www.nextplatform.com\/2021\/04\/27\/arm-puts-some-muscle-into-future-neoverse-server-cpu-designs\/\">the \u201cZeus\u201d V1 core and the \u201cPerseus\u201d N2 core<\/a>, that will have substantially wider vector engines that support the mixed precision math commonly used for machine learning inference, too. Ditto for the vector engines in <a href=\"https:\/\/www.nextplatform.com\/2021\/03\/26\/deep-dive-into-amds-milan-epyc-7003-architecture\/\">the \u201cMilan\u201d Epyc 7003 processors from AMD<\/a>.<\/p>\n<p>All of these chips are designed to keep inference on the CPUs, where in a lot of cases it belongs for data security, data compliance, and application latency reasons.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>A number of chip companies \u2014 importantly Intel and IBM, but also the Arm collective and AMD \u2014 have come out recently with new CPU designs that feature native Artificial Intelligence (AI) and machine learning (ML) acceleration. 
The need for math engines specifically designed to support machine learning algorithms, particularly for inference workloads but [\u2026]<\/p>\n","protected":false},"author":732,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[41,2229,6,1492],"tags":[],"class_list":["post-222010","post","type-post","status-publish","format-standard","hentry","category-information-science","category-mathematics","category-robotics-ai","category-security"],"_links":{"self":[{"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/posts\/222010","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/users\/732"}],"replies":[{"embeddable":true,"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/comments?post=222010"}],"version-history":[{"count":0,"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/posts\/222010\/revisions"}],"wp:attachment":[{"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/media?parent=222010"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/categories?post=222010"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/tags?post=222010"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}