This article, a lightly adapted version of Perplexity's response to NIST/CAISI Request for Information 2025-0035, details our observations and recommendations concerning the security of frontier AI agents. These insights are informed by Perplexity's experience operating general-purpose agentic systems used by millions of users and thousands of enterprises in both controlled and open-world environments. Agent architectures change core assumptions around code-data separation, authority boundaries, and execution predictability, creating new confidentiality, integrity, and availability failure modes. We map principal attack surfaces across tools, connectors, hosting boundaries, and multi-agent coordination, with particular emphasis on indirect prompt injection, confused-deputy behavior, and cascading failures in long-running workflows. We then assess current defenses as a layered stack: input-level and model-level mitigations, sandboxed execution, and deterministic policy enforcement for high-consequence actions. Finally, we identify standards and research gaps, including adaptive security benchmarks, policy models for delegation and privilege control, and guidance for secure multi-agent system design aligned with NIST risk management principles.