A conversation with a large language model (LLM) is a sequence of prompts and responses, with each response generated from the preceding conversation. AI agents build such conversations automatically: given an initial human prompt, a planner loop interleaves LLM calls with tool invocations and code execution. This tight coupling creates a new and poorly understood attack surface. A malicious prompt injected into a conversation can compromise later reasoning, trigger dangerous tool calls, or distort final outputs. Despite the centrality of such systems, we currently lack a principled semantic foundation for reasoning about their behaviour and safety. We address this gap by introducing an untyped call-by-value lambda calculus enriched with dynamic information-flow control and a small number of primitives for constructing prompt-response conversations. Our language includes a primitive that invokes an LLM: it serializes a value, sends it to the model as a prompt, and parses the response as a new term. This calculus faithfully represents planner loops and their vulnerabilities, including the mechanisms by which prompt injection alters subsequent computation. The semantics explicitly captures conversations, and so supports reasoning about defenses such as quarantined sub-conversations, isolation of generated code, and information-flow restrictions on what may influence an LLM call. A termination-insensitive noninterference theorem establishes integrity and confidentiality guarantees, demonstrating that a formal calculus can provide rigorous foundations for safe agentic programming.
翻译:与大型语言模型(LLM)的对话是由一系列提示与响应构成的序列,其中每个响应均基于先前的对话内容生成。AI智能体可自动构建此类对话:给定初始的人类提示后,规划器循环将LLM调用与工具调用及代码执行交错进行。这种紧密耦合形成了全新且尚未被充分理解的攻击面。注入对话的恶意提示可能破坏后续推理、触发危险工具调用或扭曲最终输出。尽管此类系统具有核心重要性,目前我们仍缺乏用于推演其行为与安全性的原则性语义基础。为填补这一空白,本文提出一种通过动态信息流控制增强的无类型按值调用lambda演算,并引入少量用于构建提示-响应对话的原语。该语言包含调用LLM的原语:其将值序列化后作为提示发送至模型,并将响应解析为新项。此演算可准确表征规划器循环及其脆弱性,包括提示注入改变后续计算的机制。语义显式捕获对话过程,从而支持对防御措施的推演,例如隔离子对话、隔离生成代码以及对可能影响LLM调用的信息流施加限制。通过终止不敏感的非干涉定理,我们建立了完整性与保密性保障,证明形式化演算能为安全智能体编程提供严谨的理论基础。