As AI agents increasingly operate in complex environments, ensuring reliable, context-aware privacy protection is critical for regulatory compliance. Traditional access controls are insufficient because privacy risks often arise after access is granted; agents may inadvertently compromise privacy during reasoning by messaging humans, leaking context to peers, or executing unsafe tool calls. Existing approaches typically treat privacy as a binary constraint, overlooking nuanced, computation-dependent requirements. Furthermore, Large Language Model (LLM) agents are inherently probabilistic and lack formal guarantees for security-critical operations. To address this, we introduce AgentCrypt, a three-tiered framework for secure agent communication that adds a deterministic protection layer atop any AI platform. AgentCrypt spans the full spectrum of privacy needs: from unrestricted data exchange (Level 1), to context-aware masking (Level 2), up to fully encrypted computation using Homomorphic Encryption (Level 3). Unlike prompt-based defenses, our approach guarantees that the privacy of tagged data is strictly preserved even when the underlying model errs. Security is decoupled from the agent's probabilistic reasoning, ensuring sensitive data remains protected throughout the computational lifecycle. AgentCrypt enables collaborative computation on otherwise inaccessible data, overcoming barriers such as data silos. We implemented and validated it using LangGraph and Google ADK, demonstrating versatility across architectures. Finally, we introduce a benchmark dataset simulating privacy-critical tasks to enable systematic evaluation and foster the development of trustworthy, regulatable machine learning systems.
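The three protection levels can be sketched as a deterministic dispatch layer that sits between the agent and any outbound message. This is a minimal illustrative sketch, not the paper's actual API: the names `ProtectionLevel`, `protect_message`, and the tag set are assumptions, and the Level 3 branch uses a string placeholder where a real homomorphic-encryption scheme (e.g., BFV or CKKS via an HE library) would encrypt the value.

```python
from enum import Enum

class ProtectionLevel(Enum):
    OPEN = 1       # Level 1: unrestricted data exchange
    MASKED = 2     # Level 2: context-aware masking of tagged fields
    ENCRYPTED = 3  # Level 3: fully encrypted computation (HE placeholder)

def protect_message(payload: dict, tags: set, level: ProtectionLevel) -> dict:
    """Apply the deterministic protection layer before any model sees the data.

    Because this runs outside the LLM, a model error cannot cause a tagged
    field to leave the layer unprotected.
    """
    if level is ProtectionLevel.OPEN:
        return payload
    if level is ProtectionLevel.MASKED:
        # Tagged fields are masked unconditionally, independent of the
        # agent's probabilistic reasoning.
        return {k: ("<MASKED>" if k in tags else v) for k, v in payload.items()}
    # Level 3: tagged values are encrypted; peers would compute on ciphertexts.
    return {k: (f"HE({v})" if k in tags else v) for k, v in payload.items()}

msg = {"name": "Alice", "salary": 90000}
print(protect_message(msg, {"salary"}, ProtectionLevel.MASKED))
# {'name': 'Alice', 'salary': '<MASKED>'}
```

The key design point the sketch illustrates is that protection is applied by deterministic code keyed on data tags, so the guarantee holds even if the model's own reasoning would have leaked the field.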