{"id":231653,"date":"2026-02-19T18:10:12","date_gmt":"2026-02-20T00:10:12","guid":{"rendered":"https:\/\/lifeboat.com\/blog\/2026\/02\/exposing-biases-moods-personalities-and-abstract-concepts-hidden-in-large-language-models"},"modified":"2026-02-19T18:10:12","modified_gmt":"2026-02-20T00:10:12","slug":"exposing-biases-moods-personalities-and-abstract-concepts-hidden-in-large-language-models","status":"publish","type":"post","link":"https:\/\/lifeboat.com\/blog\/2026\/02\/exposing-biases-moods-personalities-and-abstract-concepts-hidden-in-large-language-models","title":{"rendered":"Exposing biases, moods, personalities and abstract concepts hidden in large language models"},"content":{"rendered":"<p><a class=\"aligncenter blog-photo\" href=\"https:\/\/lifeboat.com\/blog.images\/exposing-biases-moods-personalities-and-abstract-concepts-hidden-in-large-language-models2.jpg\"><\/a><\/p>\n<p>Now a team from MIT and the University of California San Diego has developed a way to test whether a large language model (LLM) contains hidden biases, personalities, moods, or other abstract concepts. Their method can zero in on connections within a model that encode a concept of interest. What\u2019s more, the method can then manipulate, or \u201csteer,\u201d these connections to strengthen or weaken the concept in any answer a model is prompted to give.<\/p>\n<p>The team proved their method could quickly root out and steer more than 500 general concepts in some of the largest LLMs used today. 
For instance, the researchers could home in on a model\u2019s representations for personalities such as \u201csocial influencer\u201d and \u201cconspiracy theorist,\u201d and stances such as \u201cfear of marriage\u201d and \u201cfan of Boston.\u201d They could then tune these representations to enhance or minimize the concepts in any answers that a model generates.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Now a team from MIT and the University of California San Diego has developed a way to test whether a large language model (LLM) contains hidden biases, personalities, moods, or other abstract concepts. Their method can zero in on connections within a model that encode for a concept of interest. What\u2019s more, the method can [\u2026]<\/p>\n","protected":false},"author":662,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[6],"tags":[],"class_list":["post-231653","post","type-post","status-publish","format-standard","hentry","category-robotics-ai"],"_links":{"self":[{"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/posts\/231653","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/users\/662"}],"replies":[{"embeddable":true,"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/comments?post=231653"}],"version-history":[{"count":0,"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/posts\/231653\/revisions"}],"wp:attachment":[{"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/media?parent=231653"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/categories?post=231653"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/lifeboat.com\/blog\/wp-json\/w
p\/v2\/tags?post=231653"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}