AI Arms and Influence: Frontier Models Exhibit Sophisticated Reasoning in Simulated Nuclear Crises

Today's leading AI models behave in sophisticated ways when placed in strategic competition. They spontaneously attempt deception, signaling intentions they do not mean to act on; they demonstrate a rich theory of mind, reasoning about adversary beliefs and anticipating adversary actions; and they exhibit credible metacognitive self-awareness, assessing their own strategic abilities before deciding how to act. Here we present findings from a crisis simulation in which three frontier large language models (GPT-5.2, Claude Sonnet 4, Gemini 3 Flash) play opposing leaders in a nuclear crisis. The simulation has direct application for national security professionals, and its insights into AI reasoning under uncertainty apply far beyond international crisis decision-making. Our findings both validate and challenge central tenets of strategic theory. We find support for Schelling's ideas about commitment, Kahn's escalation framework, and Jervis's work on misperception, among others. Yet we also find that the nuclear taboo is no impediment to nuclear escalation by our models; that strategic nuclear attack, while rare, does occur; that threats more often provoke counter-escalation than compliance; that high mutual credibility accelerates rather than deters conflict; and that no model ever chose accommodation or withdrawal, even under acute pressure, opting at most for reduced levels of violence. We argue that AI simulation is a powerful tool for strategic analysis, but only if it is properly calibrated against known patterns of human reasoning. Understanding where frontier models do and do not imitate human strategic logic is essential preparation for a world in which AI increasingly shapes strategic outcomes.
