AI is moving from domain-specific autonomy in closed, predictable settings to large-language-model-driven agents that plan and act in open, cross-organizational environments. As a result, the cybersecurity risk landscape is changing in fundamental ways. Agentic AI systems can plan, act, collaborate, and persist over time, functioning as participants in complex socio-technical ecosystems rather than as isolated software components. Although recent work has strengthened defenses against model- and pipeline-level vulnerabilities such as prompt injection, data poisoning, and tool misuse, these system-centric approaches may fail to capture risks that arise from autonomy, interaction, and emergent behavior. This article introduces the 4C Framework for multi-agent AI security, inspired by societal governance. It organizes agentic risks across four interdependent dimensions: Core (system, infrastructure, and environmental integrity), Connection (communication, coordination, and trust), Cognition (belief, goal, and reasoning integrity), and Compliance (ethical, legal, and institutional governance). By shifting AI security from a narrow focus on system-centric protection to the broader preservation of behavioral integrity and intent, the framework complements existing AI security strategies and offers a principled foundation for building agentic AI systems that are trustworthy, governable, and aligned with human values.