Large language model (LLM) agents provide powerful automation capabilities, but their tight integration with non-deterministic models and third-party services gives them a substantially broader attack surface than traditional applications. Current deployments rely primarily on cloud-hosted services, yet emerging designs increasingly execute agents directly on edge devices to reduce latency and enhance user privacy. Securely hosting such complex agent pipelines on edge devices remains challenging: these deployments must protect proprietary assets (e.g., system prompts and model weights) and sensitive runtime state on heterogeneous platforms that are vulnerable to software attacks and may be controlled by malicious users. To address these challenges, we present AgenTEE, a system for deploying confidential agent pipelines on edge devices. AgenTEE places the agent runtime, inference engine, and third-party applications into independently attested confidential virtual machines (cVMs) and mediates their interaction through explicit, verifiable communication channels. Built on Arm Confidential Compute Architecture (CCA), a recent extension to Arm platforms, AgenTEE enforces strong system-level isolation of sensitive assets and runtime state. Our evaluation shows that such a multi-cVM system is practical, achieving near-native performance with less than 5.15% runtime overhead compared to multi-process deployments on a commodity OS.
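The idea of independently attested components that interact only over explicit channels can be illustrated with a small sketch. The code below is not AgenTEE's implementation and does not use any Arm CCA interface; it is a toy model under stated assumptions. The names `Component`, `Mediator`, `measure`, and the image byte strings are hypothetical, and a plain SHA-256 digest stands in for a hardware-signed attestation token.

```python
from __future__ import annotations

import hashlib
import hmac
from dataclasses import dataclass


def measure(image: bytes) -> str:
    """Stand-in for a cVM launch measurement: SHA-256 of the image bytes."""
    return hashlib.sha256(image).hexdigest()


@dataclass(frozen=True)
class Component:
    """Hypothetical pipeline component that would run in its own cVM."""
    name: str
    image: bytes

    def attest(self) -> str:
        # Assumption: a real cVM would return a signed token rooted in
        # hardware; this sketch returns only the raw measurement.
        return measure(self.image)


class Mediator:
    """Forwards a message only between verified, explicitly allowed endpoints."""

    def __init__(self, expected: dict[str, str], allowed: set[tuple[str, str]]):
        self.expected = expected  # component name -> reference measurement
        self.allowed = allowed    # permitted (source, destination) pairs

    def verified(self, component: Component) -> bool:
        reference = self.expected.get(component.name)
        if reference is None:
            return False
        # Constant-time comparison of reference and reported measurements.
        return hmac.compare_digest(reference, component.attest())

    def send(self, src: Component, dst: Component, payload: bytes) -> bytes:
        if (src.name, dst.name) not in self.allowed:
            raise PermissionError(f"no channel from {src.name} to {dst.name}")
        if not (self.verified(src) and self.verified(dst)):
            raise PermissionError("endpoint failed attestation")
        return payload  # delivered unchanged in this sketch


# Hypothetical three-component pipeline mirroring the abstract's description.
agent = Component("agent", b"agent-runtime-image")
engine = Component("engine", b"inference-engine-image")
app = Component("app", b"third-party-app-image")

mediator = Mediator(
    expected={c.name: measure(c.image) for c in (agent, engine, app)},
    allowed={("agent", "engine"), ("agent", "app")},
)
```

In this sketch, `mediator.send(agent, engine, b"prompt")` succeeds, while a send from `app` to `engine` is rejected because no such channel is declared, and a component whose image differs from the reference measurement is rejected even on an allowed channel.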