Agentic systems built on large language models (LLMs) extend beyond text generation to autonomously retrieve information and invoke tools. This runtime execution model shifts the attack surface from build-time artifacts to inference-time dependencies, exposing agents to manipulation through untrusted data and probabilistic capability resolution. While prior work has focused on model-level vulnerabilities, the treatment of security risks emerging from cyclic and interdependent runtime behavior remains fragmented. We systematize these risks within a unified runtime framework, categorizing threats into data supply chain attacks (transient context injection and persistent memory poisoning) and tool supply chain attacks (discovery, implementation, and invocation). We further identify the Viral Agent Loop, in which agents act as vectors for self-propagating generative worms without exploiting code-level flaws. Finally, we advocate a Zero-Trust Runtime Architecture that treats context as untrusted control flow and constrains tool execution through cryptographic provenance rather than semantic inference.
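The final claim, constraining tool execution through cryptographic provenance rather than semantic inference, can be illustrated with a minimal sketch. The snippet below is a hypothetical illustration, not the paper's implementation: it assumes a trusted tool registry that signs each tool manifest with a shared key, and an agent runtime that verifies the signature before dispatching a call, so that a tool injected into context without valid provenance is rejected regardless of how plausible its description reads.

```python
import hashlib
import hmac
import json

# Hypothetical shared key of a trusted tool registry (illustrative only).
REGISTRY_KEY = b"trusted-registry-secret"

# Tool table; names and signatures below are assumptions for this sketch.
TOOLS = {"add": lambda a, b: a + b}


def sign_manifest(manifest: dict, key: bytes = REGISTRY_KEY) -> str:
    """Registry-side: sign a canonicalized tool manifest."""
    payload = json.dumps(manifest, sort_keys=True).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def invoke_tool(manifest: dict, signature: str, args: dict):
    """Runtime-side: verify provenance, then dispatch the tool call."""
    payload = json.dumps(manifest, sort_keys=True).encode()
    expected = hmac.new(REGISTRY_KEY, payload, hashlib.sha256).hexdigest()
    # Constant-time comparison; reject tools lacking valid provenance.
    if not hmac.compare_digest(expected, signature):
        raise PermissionError("tool provenance check failed")
    return TOOLS[manifest["name"]](**args)


manifest = {"name": "add", "version": "1.0"}
sig = sign_manifest(manifest)
print(invoke_tool(manifest, sig, {"a": 2, "b": 3}))  # 5
```

Note that the decision to execute never consults the model's semantic judgment about the tool: a manifest altered after signing (e.g., a different version or name smuggled in via context) fails verification and is refused.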