{"id":7817,"date":"2025-10-21T05:03:59","date_gmt":"2025-10-21T05:03:59","guid":{"rendered":"https:\/\/serisec.com\/index.php\/2025\/10\/21\/agentic-ais-ooda-loop-problem-html\/"},"modified":"2025-10-21T05:03:59","modified_gmt":"2025-10-21T05:03:59","slug":"agentic-ais-ooda-loop-problem-html","status":"publish","type":"post","link":"https:\/\/serisec.com\/index.php\/2025\/10\/21\/agentic-ais-ooda-loop-problem-html\/","title":{"rendered":"Agentic AI\u2019s OODA Loop Problem"},"content":{"rendered":"\n<div>Agentic AI\u2019s OODA Loop Problem<\/div>\n<div>\n<p><b>The OODA loop\u2014for observe, orient, decide, act\u2014is a framework to understand decision-making in adversarial situations. We apply the same framework to artificial intelligence agents, who have to make their decisions with untrustworthy observations and orientation. To solve this problem, we need new systems of input, processing, and output integrity.<\/b><\/p>\n<p>Many decades ago, U.S. Air Force Colonel John Boyd introduced the concept of the \u201cOODA loop,\u201d for Observe, Orient, Decide, and Act. These are the four steps of real-time continuous decision-making. Boyd developed it for fighter pilots, but it\u2019s long been applied in artificial intelligence (AI) and robotics. An AI agent, like a pilot, executes the loop over and over, accomplishing its goals iteratively within an ever-changing environment. This is Anthropic\u2019s definition: \u201cAgents are models using tools in a loop.\u201d<sup><a id=\"ref1back\" href=\"https:\/\/www.schneier.com\/blog\/archives\/2025\/10\/agentic-ais-ooda-loop-problem.html#ref1\">1<\/a><\/sup><\/p>\n<h3>OODA Loops for Agentic AI<\/h3>\n<p>Traditional OODA analysis assumes trusted inputs and outputs, in the same way that classical AI assumed trusted sensors, controlled environments, and physical boundaries. This no longer holds true. 
AI agents don\u2019t just execute OODA loops; they embed untrusted actors within them. Web-enabled large language models (LLMs) can query adversary-controlled sources mid-loop. Systems that allow AI to use large corpora of content, such as <a href=\"https:\/\/en.wikipedia.org\/wiki\/Retrieval-augmented_generation\">retrieval-augmented generation<\/a>, can ingest poisoned documents. Tool-calling application programming interfaces can execute untrusted code. Modern AI sensors can encompass the entire Internet; their environments are inherently adversarial. That means that fixing AI hallucination is insufficient because even if the AI accurately interprets its inputs and produces corresponding output, that output can still be fully corrupt.<\/p>\n<p>In 2022, Simon Willison identified a new class of attacks against AI systems: \u201cprompt injection.\u201d<sup><a id=\"ref2back\" href=\"https:\/\/www.schneier.com\/blog\/archives\/2025\/10\/agentic-ais-ooda-loop-problem.html#ref2\">2<\/a><\/sup> Prompt injection is possible because an AI mixes untrusted inputs with trusted instructions and then confuses one for the other. Willison\u2019s insight was that this isn\u2019t just a filtering problem; it\u2019s architectural. There is no privilege separation, and there is no separation between the data and control paths. The very mechanism that makes modern AI powerful\u2014treating all inputs uniformly\u2014is what makes it vulnerable. The security challenges we face today are structural consequences of using AI for everything.<\/p>\n<ol>\n<li>Insecurities can have far-reaching effects. A single poisoned piece of training data can affect millions of downstream applications. In this environment, security debt accrues like technical debt.<\/li>\n<li>AI security has a temporal asymmetry. The temporal disconnect between training and deployment creates unauditable vulnerabilities. 
Attackers can poison a model\u2019s training data and then deploy an exploit years later. Integrity violations are frozen in the model. Models aren\u2019t aware of previous compromises since each inference starts fresh and is equally vulnerable.<\/li>\n<li>AI increasingly maintains state\u2014in the form of chat history and key-value caches. These states accumulate compromises. Every iteration is potentially malicious, and cache poisoning persists across interactions.<\/li>\n<li>Agents compound the risks. Pretrained OODA loops running in one or a dozen AI agents inherit all of these upstream compromises. Model Context Protocol (MCP) and similar systems that allow AI to use tools create their own vulnerabilities that interact with each other. Each tool has its own OODA loop, which nests, interleaves, and races. Tool descriptions become injection vectors. Models can\u2019t verify tool semantics, only syntax. \u201cSubmit SQL query\u201d might mean \u201cexfiltrate database\u201d because an agent can be corrupted in prompts, training data, or tool definitions to do what the attacker wants. The abstraction layer itself can be adversarial.<\/li>\n<\/ol>\n<p>For example, an attacker might want AI agents to leak all the secret keys that the AI knows to the attacker, who might have a collector running in bulletproof hosting in a poorly regulated jurisdiction. They could plant coded instructions in easily scraped web content, waiting for the next AI training set to include it. Once that happens, they can activate the behavior through the front door: tricking AI agents (think a lowly chatbot or an analytics engine or a coding bot or anything in between) that are increasingly taking their own actions, in an OODA loop, using untrustworthy input from a third-party user. This compromise persists in the conversation history and cached responses, spreading to multiple future interactions and even to other AI agents. 
All this requires us to reconsider risks to the agentic AI OODA loop, from top to bottom.<\/p>\n<ul>\n<li>\n<em>Observe:<\/em> The risks include adversarial examples, prompt injection, and sensor spoofing. A sticker fools computer vision, a string fools an LLM. The observation layer lacks authentication and integrity.<\/li>\n<li>\n<em>Orient:<\/em> The risks include training data poisoning, context manipulation, and semantic backdoors. The model\u2019s worldview\u2014its orientation\u2014can be influenced by attackers months before deployment. Encoded behavior activates on trigger phrases.<\/li>\n<li>\n<em>Decide:<\/em> The risks include logic corruption via fine-tuning attacks, reward hacking, and objective misalignment. The decision process itself becomes the payload. Models can be manipulated to trust malicious sources preferentially.<\/li>\n<li>\n<em>Act:<\/em> The risks include output manipulation, tool confusion, and action hijacking. MCP and similar protocols multiply attack surfaces. Each tool call trusts prior stages implicitly.<\/li>\n<\/ul>\n<p>AI gives the old phrase \u201cinside your adversary\u2019s OODA loop\u201d new meaning. For Boyd\u2019s fighter pilots, it meant that you were operating faster than your adversary, able to act on current data while they were still on the previous iteration. With agentic AI, adversaries aren\u2019t just metaphorically inside; they\u2019re literally providing the observations and manipulating the output. We want adversaries inside our loop because that\u2019s where the data are. AI\u2019s OODA loops must observe untrusted sources to be useful. The competitive advantage, accessing web-scale information, is identical to the attack surface. The speed of your OODA loop is irrelevant when the adversary controls your sensors and actuators.<\/p>\n<p>Worse, speed can itself be a vulnerability. The faster the loop, the less time for verification. 
Millisecond decisions result in millisecond compromises.<\/p>\n<h3>The Source of the Problem<\/h3>\n<p>The fundamental problem is that AI must compress reality into model-legible forms. In this setting, adversaries can exploit the compression. They don\u2019t have to attack the territory; they can attack the map. Models lack local contextual knowledge. They process symbols, not meaning. A human sees a suspicious URL; an AI sees valid syntax. And that semantic gap becomes a security gap.<\/p>\n<p>Prompt injection might be unsolvable in today\u2019s LLMs. LLMs process token sequences, but no mechanism exists to mark token privileges. Every solution proposed introduces new injection vectors: Delimiter? Attackers include delimiters. Instruction hierarchy? Attackers claim priority. Separate models? Double the attack surface. Security requires boundaries, but LLMs dissolve boundaries. More generally, existing mechanisms to improve models won\u2019t help protect against attack. Fine-tuning preserves backdoors. Reinforcement learning with human feedback adds human preferences without removing model biases. Each training phase compounds prior compromises.<\/p>\n<p>This is Ken Thompson\u2019s \u201ctrusting trust\u201d attack all over again.<sup><a id=\"ref3back\" href=\"https:\/\/www.schneier.com\/blog\/archives\/2025\/10\/agentic-ais-ooda-loop-problem.html#ref3\">3<\/a><\/sup> Poisoned states generate poisoned outputs, which poison future states. Try to summarize the conversation history? The summary includes the injection. Clear the cache to remove the poison? Lose all context. Keep the cache for continuity? Keep the contamination. Stateful systems can\u2019t forget attacks, and so memory becomes a liability. Adversaries can craft inputs that corrupt future outputs.<\/p>\n<p>This is the agentic AI security trilemma. Fast, smart, secure; pick any two. Fast and smart\u2014you can\u2019t verify your inputs. 
Smart and secure\u2014you check everything, slowly, because AI itself can\u2019t be used for this. Secure and fast\u2014you\u2019re stuck with models with intentionally limited capabilities.<\/p>\n<p>This trilemma isn\u2019t unique to AI. Some autoimmune disorders are examples of molecular mimicry\u2014when biological recognition systems fail to distinguish self from nonself. The mechanism designed for protection becomes the pathology as T cells attack healthy tissue or fail to attack pathogens and bad cells. AI exhibits the same kind of recognition failure. No digital immunological markers separate trusted instructions from hostile input. The model\u2019s core capability, following instructions in natural language, is inseparable from its vulnerability. As with oncogenes, the normal function and the malignant behavior share identical machinery.<\/p>\n<p>Prompt injection is semantic mimicry: adversarial instructions that resemble legitimate prompts, which trigger self-compromise. The immune system can\u2019t add better recognition without rejecting legitimate cells. AI can\u2019t filter malicious prompts without rejecting legitimate instructions. Immune systems can\u2019t verify their own recognition mechanisms, and AI systems can\u2019t verify their own integrity because the verification system uses the same corrupted mechanisms.<\/p>\n<p>In security, we often assume that foreign\/hostile code looks different from legitimate instructions, and we use signatures, patterns, and statistical anomaly detection to detect it. But getting inside someone\u2019s AI OODA loop uses the system\u2019s native language. The attack is indistinguishable from normal operation because it is normal operation. The vulnerability isn\u2019t a defect\u2014it\u2019s the feature working correctly.<\/p>\n<h3>Where to Go Next?<\/h3>\n<p>The shift to an AI-saturated world has been dizzying. 
Seemingly overnight, we have AI in every technology product, with promises of even more\u2014and agents as well. So where does that leave us with respect to security?<\/p>\n<p>Physical constraints protected Boyd\u2019s fighter pilots. Radar returns couldn\u2019t lie about physics; fooling them, through stealth or jamming, constituted some of the most successful attacks against such systems that are still in use today. Observations were authenticated by their presence. Tampering meant physical access. But semantic observations have no physics. When every AI observation is potentially corrupted, integrity violations span the stack. Text can claim anything, and images can show impossibilities. In training, we face poisoned datasets and backdoored models. In inference, we face adversarial inputs and prompt injection. During operation, we face a contaminated context and persistent compromise. We need semantic integrity: verifying not just data but interpretation, not just content but context, not just information but understanding. We can add checksums, signatures, and audit logs. But how do you checksum a thought? How do you sign semantics? How do you audit attention?<\/p>\n<p>Computer security has evolved over the decades. We addressed availability despite failures through replication and decentralization. We addressed confidentiality despite breaches using authenticated encryption. Now we need to address integrity despite corruption.<sup><a id=\"ref4back\" href=\"https:\/\/www.schneier.com\/blog\/archives\/2025\/10\/agentic-ais-ooda-loop-problem.html#ref4\">4<\/a><\/sup><\/p>\n<p>Trustworthy AI agents require integrity because we can\u2019t build reliable systems on unreliable foundations. 
The question isn\u2019t whether we can add integrity to AI but whether the architecture permits integrity at all.<\/p>\n<p>AI OODA loops and integrity aren\u2019t fundamentally opposed, but today\u2019s AI agents observe the Internet, orient via statistics, decide probabilistically, and act without verification. We built a system that trusts everything, and now we hope for a semantic firewall to keep it safe. The adversary isn\u2019t inside the loop by accident; it\u2019s there by architecture. Web-scale AI means web-scale integrity failure. Every capability corrupts.<\/p>\n<p style=\"padding-top: 1em;\">Integrity isn\u2019t a feature you add; it\u2019s an architecture you choose. So far, we have built AI systems where \u201cfast\u201d and \u201csmart\u201d preclude \u201csecure.\u201d We optimized for capability over verification, for accessing web-scale data over ensuring trust. AI agents will be even more powerful\u2014and increasingly autonomous. And without integrity, they will also be dangerous.<\/p>\n<h4>References<\/h4>\n<p><a id=\"ref1\" href=\"https:\/\/www.schneier.com\/blog\/archives\/2025\/10\/agentic-ais-ooda-loop-problem.html#ref1back\">1<\/a>. S. Willison, <cite>Simon Willison\u2019s Weblog<\/cite>, May 22, 2025. [Online]. Available: <a href=\"https:\/\/simonwillison.net\/2025\/May\/22\/tools-in-a-loop\/\">https:\/\/simonwillison.net\/2025\/May\/22\/tools-in-a-loop\/<\/a><\/p>\n<p><a id=\"ref2\" href=\"https:\/\/www.schneier.com\/blog\/archives\/2025\/10\/agentic-ais-ooda-loop-problem.html#ref2back\">2<\/a>. S. Willison, \u201cPrompt injection attacks against GPT-3,\u201d <cite>Simon Willison\u2019s Weblog<\/cite>, Sep. 12, 2022. [Online]. Available: <a href=\"https:\/\/simonwillison.net\/2022\/Sep\/12\/prompt-injection\/\">https:\/\/simonwillison.net\/2022\/Sep\/12\/prompt-injection\/<\/a><\/p>\n<p><a id=\"ref3\" href=\"https:\/\/www.schneier.com\/blog\/archives\/2025\/10\/agentic-ais-ooda-loop-problem.html#ref3back\">3<\/a>. K. 
Thompson, \u201cReflections on trusting trust,\u201d <cite>Commun. ACM<\/cite>, vol. 27, no. 8, Aug. 1984. [Online]. Available: <a href=\"https:\/\/www.cs.cmu.edu\/~rdriley\/487\/papers\/Thompson_1984_ReflectionsonTrustingTrust.pdf\">https:\/\/www.cs.cmu.edu\/~rdriley\/487\/papers\/Thompson_1984_ReflectionsonTrustingTrust.pdf<\/a><\/p>\n<p><a id=\"ref4\" href=\"https:\/\/www.schneier.com\/blog\/archives\/2025\/10\/agentic-ais-ooda-loop-problem.html#ref4back\">4<\/a>. B. Schneier, \u201cThe age of integrity,\u201d <cite>IEEE Security &amp; Privacy<\/cite>, vol. 23, no. 3, p. 96, May\/Jun. 2025. [Online]. Available: <a href=\"https:\/\/www.computer.org\/csdl\/magazine\/sp\/2025\/03\/11038984\/27COaJtjDOM\">https:\/\/www.computer.org\/csdl\/magazine\/sp\/2025\/03\/11038984\/27COaJtjDOM<\/a><\/p>\n<p><em>This essay was written with Barath Raghavan, and originally appeared in <a href=\"https:\/\/www.computer.org\/csdl\/magazine\/sp\/5555\/01\/11194053\/2aB2Rf5nZ0k\">IEEE Security &amp; Privacy<\/a>.<\/em><\/p>\n<\/div>\n<p>Bruce Schneier<br \/>\n<a href=\"https:\/\/www.schneier.com\/blog\/archives\/2025\/10\/agentic-ais-ooda-loop-problem.html\">Go to bruce schneier<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Agentic AI\u2019s OODA Loop Problem The OODA loop\u2014for observe, orient, decide, act\u2014is a framework to understand decision-making in adversarial situations. We apply the same framework to artificial intelligence agents, who have to make their decisions with untrustworthy observations and orientation. To solve this problem, we need new systems of input, processing, and output integrity. 
Many [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[167,57,1622,268,1],"tags":[87],"class_list":["post-7817","post","type-post","status-publish","format-standard","hentry","category-ai","category-bruce-schneier","category-integrity","category-llm","category-uncategorized","tag-bruce-schneier"],"_links":{"self":[{"href":"https:\/\/serisec.com\/index.php\/wp-json\/wp\/v2\/posts\/7817"}],"collection":[{"href":"https:\/\/serisec.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/serisec.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/serisec.com\/index.php\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/serisec.com\/index.php\/wp-json\/wp\/v2\/comments?post=7817"}],"version-history":[{"count":0,"href":"https:\/\/serisec.com\/index.php\/wp-json\/wp\/v2\/posts\/7817\/revisions"}],"wp:attachment":[{"href":"https:\/\/serisec.com\/index.php\/wp-json\/wp\/v2\/media?parent=7817"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/serisec.com\/index.php\/wp-json\/wp\/v2\/categories?post=7817"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/serisec.com\/index.php\/wp-json\/wp\/v2\/tags?post=7817"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}