Judging the safety of an action, whether taken by a human or a system, must take into account the context in which the action takes place. Deleting an email from a user's mailbox may or may not be appropriate depending on the email's content, the user's goals, or even the available storage space. Systems that make these judgements today, providing security against harmful or inappropriate actions, rely on manually crafted policies or user confirmation for each relevant context. With the upcoming deployment of systems like generalist agents, we argue that we must rethink security designs to adapt to the scale of contexts and capabilities of these systems. As a first step, this paper explores contextual security in the domain of agents and proposes contextual security for agents (Conseca), a framework to generate just-in-time, contextual, and human-verifiable security policies.