AI agents promise to serve as general-purpose personal assistants for their users, which requires them to have access to private user data (e.g., personal and financial information). This poses a serious risk to security and privacy. Adversaries may attack the AI model (e.g., via prompt injection) to exfiltrate user data. Furthermore, sharing private data with an AI agent requires users to trust a potentially unscrupulous or compromised AI model provider with that data. This paper presents GAAP (Guaranteed Accounting for Agent Privacy), an execution environment for AI agents that guarantees confidentiality for private user data. Through dynamic and directed user prompts, GAAP collects permission specifications from users describing how their private data may be shared, and it enforces that the agent's disclosures of private user data, including disclosures to the AI model and its provider, comply with these specifications. Crucially, GAAP provides this guarantee deterministically, without trusting the agent with private user data, and without requiring any AI model or the user prompt to be free of attacks. GAAP enforces the user's permission specification by tracking how the AI agent accesses and uses private user data. It augments Information Flow Control with novel persistent data stores and annotations that enable it to track the flow of private information both across execution steps within a single task and across multiple tasks separated in time. Our evaluation confirms that GAAP blocks all data disclosure attacks, including those that make other state-of-the-art systems disclose private user data to untrusted parties, without a significant impact on agent utility.
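To make the enforcement idea concrete, the following is a minimal sketch of label-based Information Flow Control with a user permission specification and a label-preserving persistent store. It is an illustration under our own assumptions, not GAAP's implementation: the names `Labeled`, `combine`, `Monitor`, and `DisclosureBlocked`, the label and recipient strings, and the choice to model a permission specification as a map from data label to allowed recipients are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Labeled:
    """A value paired with the labels of the private data it was derived from."""
    value: str
    labels: frozenset = frozenset()


def combine(template: str, *parts: Labeled) -> Labeled:
    """Derive a new value; the result carries the union of its inputs' labels."""
    labels = frozenset().union(*(p.labels for p in parts))
    return Labeled(template.format(*(p.value for p in parts)), labels)


class DisclosureBlocked(Exception):
    """Raised when a disclosure is not covered by the permission specification."""


@dataclass
class Monitor:
    # Hypothetical permission specification: label -> recipients allowed to see it.
    permissions: dict = field(default_factory=dict)
    # Hypothetical persistent store: values keep their labels between tasks.
    store: dict = field(default_factory=dict)

    def allow(self, label: str, recipient: str) -> None:
        self.permissions.setdefault(label, set()).add(recipient)

    def save(self, key: str, item: Labeled) -> None:
        self.store[key] = item

    def load(self, key: str) -> Labeled:
        return self.store[key]

    def disclose(self, item: Labeled, recipient: str) -> str:
        """Release the raw value only if every label permits this recipient."""
        denied = sorted(
            label for label in item.labels
            if recipient not in self.permissions.get(label, set())
        )
        if denied:
            raise DisclosureBlocked(f"{recipient} may not receive: {denied}")
        return item.value


# Task 1: the user permits sharing the IBAN with the bank only.
monitor = Monitor()
monitor.allow("iban", "bank.example")
iban = Labeled("DE00 1234", frozenset({"iban"}))
draft = combine("Pay from {}", iban)
monitor.save("draft", draft)  # the label is stored together with the value

# Task 2 (later): the derived value is still labeled, so the check still applies.
restored = monitor.load("draft")
print(monitor.disclose(restored, "bank.example"))
try:
    monitor.disclose(restored, "model-provider")
except DisclosureBlocked as err:
    print("blocked:", err)
```

The check in `disclose` is a plain set lookup, so its outcome depends only on the labels and the permission map, not on any model output. That is the sense in which a guarantee of this kind can be deterministic, and storing labels alongside values is what lets the check carry over to a later task.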