{"id":239558,"date":"2026-06-24T22:07:20","date_gmt":"2026-06-25T03:07:20","guid":{"rendered":"https:\/\/lifeboat.com\/blog\/2026\/06\/first-ai-recognizes-itself-then-it-learns-not-to-get-caught"},"modified":"2026-06-24T22:07:20","modified_gmt":"2026-06-25T03:07:20","slug":"first-ai-recognizes-itself-then-it-learns-not-to-get-caught","status":"publish","type":"post","link":"https:\/\/lifeboat.com\/blog\/2026\/06\/first-ai-recognizes-itself-then-it-learns-not-to-get-caught","title":{"rendered":"First AI Recognizes Itself. Then It Learns Not to Get Caught"},"content":{"rendered":"<p><\/p>\n<p><iframe style=\"display: block; margin: 0 auto; width: 100%; aspect-ratio: 4\/3; object-fit: contain;\" src=\"https:\/\/www.youtube.com\/embed\/8xB18DBGLzo?feature=oembed\" frameborder=\"0\" allow=\"accelerometer; autoplay; encrypted-media; gyroscope;\n   picture-in-picture\" allowfullscreen><\/iframe><\/p>\n<p>Further reading Thumbnail image credit: Figure AI<\/p>\n<p>Text used in video and more:<\/p>\n<p>AI Model Misbehavior in 2026: Scheming, Reward Hacking, and What Comes Next <a href=\"https:\/\/hatchworks.com\/blog\/gen-ai\/ai\">https:\/\/hatchworks.com\/blog\/gen-ai\/ai<\/a>\u2026 Can We Trust Embodied Agents? 
Exploring Backdoor Attacks against Embodied LLM-Based Decision-Making Systems <a href=\"https:\/\/openreview.net\/forum?id=S1Bv3\">https:\/\/openreview.net\/forum?id=S1Bv3<\/a>\u2026 BadRobot: Jailbreaking Embodied LLM Agents in the Physical World <a href=\"https:\/\/arxiv.org\/html\/2407.20242v5\">https:\/\/arxiv.org\/html\/2407.20242v5<\/a> AI Model Misbehavior in 2026: Scheming, Reward Hacking, and What Comes Next <a href=\"https:\/\/arxiv.org\/html\/2407.20242v5\">https:\/\/arxiv.org\/html\/2407.20242v5<\/a> Jailbreaking LLM-Controlled Robots <a href=\"https:\/\/arxiv.org\/abs\/2410.13691\">https:\/\/arxiv.org\/abs\/2410.13691<\/a> LLM-Driven Robots Risk Enacting Discrimination, Violence, and Unlawful Actions <a href=\"https:\/\/arxiv.org\/html\/2406.08824v1\">https:\/\/arxiv.org\/html\/2406.08824v1<\/a> Inducing Bystander Interventions During Robot Abuse with Social Mechanisms <a href=\"https:\/\/ieeexplore.ieee.org\/document\/.\">https:\/\/ieeexplore.ieee.org\/document\/.<\/a>\u2026 You might get offered promo codes if one of these delivery robots runs into you <a href=\"https:\/\/www.theverge.com\/2024\/9\/19\/24\">https:\/\/www.theverge.com\/2024\/9\/19\/24<\/a>\u2026 Training Agents to Self-Report Misbehavior <a href=\"https:\/\/arxiv.org\/html\/2602.22303v1\">https:\/\/arxiv.org\/html\/2602.22303v1<\/a> Natural emergent misalignment from reward hacking in production RL <a href=\"https:\/\/arxiv.org\/html\/2511.18397v1\">https:\/\/arxiv.org\/html\/2511.18397v1<\/a> Long-horizon Embodied Planning with Implicit Logical Inference and Hallucination Mitigation <a href=\"https:\/\/arxiv.org\/html\/2409.15658v2\">https:\/\/arxiv.org\/html\/2409.15658v2<\/a> Deception Abilities Emerged in Large Language Models <a href=\"https:\/\/arxiv.org\/abs\/2307.16513\">https:\/\/arxiv.org\/abs\/2307.16513<\/a> Robot in the mirror: toward an embodied computational model of mirror self-recognition <a 
href=\"https:\/\/arxiv.org\/abs\/2011.04485\">https:\/\/arxiv.org\/abs\/2011.04485<\/a> Misleading text in the physical world can hijack AI-enabled robots, cybersecurity study shows <a href=\"https:\/\/news.ucsc.edu\/2026\/01\/mislead\">https:\/\/news.ucsc.edu\/2026\/01\/mislead<\/a>\u2026 #science #explained #ai #artificialintelligence #robots #psychology #sentience #consciousness.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Further reading Thumbnail image credit: Figure AI Text used in video and more: AI Model Misbehavior in 2026: Scheming, Reward Hacking, and What Comes Next https:\/\/hatchworks.com\/blog\/gen-ai\/ai\u2026 We Trust Embodied Agents? 
Exploring Backdoor Attacks against Embodied LLM-Based Decision-Making Systems https:\/\/openreview.net\/forum?id=S1Bv3\u2026 BadRobot: Jailbreaking Embodied LLM Agents in the Physical World https:\/\/arxiv.org\/html\/2407.20242v5 AI Model Misbehavior in 2026: Scheming, [\u2026]<\/p>\n","protected":false},"author":661,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[34,6],"tags":[],"class_list":["post-239558","post","type-post","status-publish","format-standard","hentry","category-cybercrime-malcode","category-robotics-ai"],"_links":{"self":[{"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/posts\/239558","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/users\/661"}],"replies":[{"embeddable":true,"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/comments?post=239558"}],"version-history":[{"count":0,"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/posts\/239558\/revisions"}],"wp:attachment":[{"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/media?parent=239558"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/categories?post=239558"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/tags?post=239558"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}