{"id":216192,"date":"2025-06-18T13:06:31","date_gmt":"2025-06-18T18:06:31","guid":{"rendered":"https:\/\/lifeboat.com\/blog\/2025\/06\/how-can-we-tell-if-ai-is-lying-new-method-tests-whether-ai-explanations-are-truthful"},"modified":"2025-06-18T13:06:31","modified_gmt":"2025-06-18T18:06:31","slug":"how-can-we-tell-if-ai-is-lying-new-method-tests-whether-ai-explanations-are-truthful","status":"publish","type":"post","link":"https:\/\/lifeboat.com\/blog\/2025\/06\/how-can-we-tell-if-ai-is-lying-new-method-tests-whether-ai-explanations-are-truthful","title":{"rendered":"How can we tell if AI is lying? New method tests whether AI explanations are truthful"},"content":{"rendered":"<p>Given the recent explosion of large language models (LLMs) that can make convincingly human-like statements, it makes sense that there\u2019s been a growing focus on developing models that can explain how they make decisions. But how can we be sure that what they\u2019re saying is the truth?<\/p>\n<p>In a <a href=\"https:\/\/openreview.net\/forum?id=4ub9gpx9xw\" target=\"_blank\">new paper<\/a>, researchers from Microsoft and MIT\u2019s Computer Science and Artificial Intelligence Laboratory (CSAIL) propose a novel method for measuring LLM explanations with respect to their \u201cfaithfulness\u201d\u2014that is, how accurately an explanation represents the reasoning process behind the model\u2019s answer.<\/p>\n<p>As lead author and Ph.D. 
student Katie Matton explains, faithfulness is no minor concern: if an LLM produces explanations that are plausible but unfaithful, users might develop false confidence in its responses and fail to recognize when its recommendations conflict with their own values, such as avoiding bias in hiring.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Given the recent explosion of large language models (LLMs) that can make convincingly human-like statements, it makes sense that there\u2019s been a growing focus on developing models that can explain how they make decisions. But how can we be sure that what they\u2019re saying is the truth? In a new paper, researchers [\u2026]<\/p>\n","protected":false},"author":707,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[6],"tags":[],"class_list":["post-216192","post","type-post","status-publish","format-standard","hentry","category-robotics-ai"],"_links":{"self":[{"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/posts\/216192","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/users\/707"}],"replies":[{"embeddable":true,"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/comments?post=216192"}],"version-history":[{"count":0,"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/posts\/216192\/revisions"}],"wp:attachment":[{"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/media?parent=216192"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/categories?post=216192"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/tags?post=216192"}],"curies":[{"name":"wp","href":"https:\/\/ap
i.w.org\/{rel}","templated":true}]}}