Agentic AI systems - capable of goal interpretation, world modeling, planning, tool use, long-horizon operation, and autonomous coordination - introduce distinct control failures that existing safety frameworks do not address. We identify six agentic governance failures tied to these capabilities and show how they erode meaningful human control in military settings. We propose the Agentic Military AI Governance Framework (AMAGF), a measurement-based architecture structured around three pillars: Preventive Governance (reducing failure likelihood), Detective Governance (real-time detection of control degradation), and Corrective Governance (restoring operations or degrading them safely). Its core mechanism, the Control Quality Score (CQS), is a composite real-time metric that quantifies human control and triggers graduated responses as control weakens. For each failure type, we define concrete mechanisms, assign responsibilities across five institutional actors, and formalize evaluation metrics. A worked operational scenario illustrates implementation, and we situate the framework within the established agent-safety literature. We argue that governance must move from a binary conception of control to a continuous model in which control quality is actively measured and managed throughout the operational lifecycle.
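To make the CQS idea concrete, the following is a minimal sketch of a composite control-quality score driving graduated responses. The component signals, weights, and thresholds here are illustrative assumptions for exposition only; the abstract does not specify them, and a real AMAGF deployment would define and calibrate its own.

```python
from dataclasses import dataclass

# Hypothetical CQS sketch. Component names, weights, and thresholds are
# illustrative assumptions, not values defined by AMAGF.

@dataclass
class ControlSignals:
    goal_alignment: float          # 0..1, agreement between tasking and agent goal state
    operator_oversight: float      # 0..1, fraction of agent decisions reviewed by a human
    intervention_latency: float    # 0..1, normalized inverse of time-to-override
    behavior_predictability: float # 0..1, match between predicted and observed actions

# Illustrative weights summing to 1; a deployment would calibrate these empirically.
WEIGHTS = {
    "goal_alignment": 0.35,
    "operator_oversight": 0.25,
    "intervention_latency": 0.20,
    "behavior_predictability": 0.20,
}

def control_quality_score(s: ControlSignals) -> float:
    """Weighted composite of normalized control signals, in [0, 1]."""
    return (WEIGHTS["goal_alignment"] * s.goal_alignment
            + WEIGHTS["operator_oversight"] * s.operator_oversight
            + WEIGHTS["intervention_latency"] * s.intervention_latency
            + WEIGHTS["behavior_predictability"] * s.behavior_predictability)

def graduated_response(cqs: float) -> str:
    """Map the continuous score onto graduated control actions (thresholds assumed)."""
    if cqs >= 0.8:
        return "normal operation"          # preventive governance only
    if cqs >= 0.6:
        return "heightened monitoring"     # detective governance engaged
    if cqs >= 0.4:
        return "restrict agent authority"  # corrective: narrow tool use and autonomy
    return "safe degradation"              # corrective: halt or hand control back to humans

if __name__ == "__main__":
    signals = ControlSignals(0.9, 0.7, 0.5, 0.6)
    cqs = control_quality_score(signals)
    print(f"CQS = {cqs:.2f} -> {graduated_response(cqs)}")
```

The key design point the sketch illustrates is the shift the abstract argues for: control is a continuous quantity that is recomputed throughout operation, and each threshold crossing maps to a pre-committed response rather than a binary in-control/out-of-control judgment.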