As LLMs increasingly act as autonomous agents in interactive and multi-agent settings, understanding their strategic behavior is critical for safety, coordination, and AI-driven social and economic systems. We investigate how payoff magnitude and linguistic context shape LLM strategies in repeated social dilemmas, using a payoff-scaled Prisoner's Dilemma to isolate sensitivity to incentive strength. Across models and languages, we observe recurring behavioral patterns, including incentive-sensitive conditional strategies and cross-linguistic divergence. To interpret these dynamics, we train supervised classifiers on canonical repeated-game strategies and apply them to LLM decisions, revealing systematic, model- and language-dependent behavioral intentions, with the effect of linguistic framing sometimes matching or exceeding that of model architecture. Our results provide a unified framework for auditing LLMs as strategic agents and highlight cooperation biases with direct implications for AI governance and multi-agent system design.
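To make the measurement pipeline concrete, the sketch below is illustrative only and is not the paper's implementation: it assumes a standard (R, S, T, P) Prisoner's Dilemma matrix scaled by a hypothetical factor k to vary incentive strength, four canonical strategies (Always-Cooperate, Always-Defect, Tit-for-Tat, Grim Trigger) used to generate labeled trajectories, and a logistic-regression classifier over simple behavioral features. All function names, the feature set, and the opponent model are our assumptions.

```python
# Minimal sketch (assumptions, not the paper's pipeline): classify an observed
# action sequence in a payoff-scaled repeated Prisoner's Dilemma against
# canonical strategies via a classifier trained on simulated trajectories.
import numpy as np
from sklearn.linear_model import LogisticRegression

C, D = 0, 1  # cooperate / defect

def scaled_payoffs(k=1.0):
    # Hypothetical scaling: multiply a standard (R, S, T, P) payoff matrix
    # by k to vary incentive strength while preserving the game's structure.
    R, S, T, P = 3, 0, 5, 1
    return {(C, C): (k * R, k * R), (C, D): (k * S, k * T),
            (D, C): (k * T, k * S), (D, D): (k * P, k * P)}

# Canonical strategies: map (own history, opponent history) -> next action.
STRATEGIES = {
    "always_cooperate": lambda mine, opp: C,
    "always_defect":    lambda mine, opp: D,
    "tit_for_tat":      lambda mine, opp: opp[-1] if opp else C,
    "grim_trigger":     lambda mine, opp: D if D in opp else C,
}

def simulate(strategy, opponent, rounds=20, noise=0.05, rng=None):
    # Play `rounds` simultaneous moves; flip the focal action with small
    # probability so training labels reflect noisy execution.
    if rng is None:
        rng = np.random.default_rng()
    mine, opp = [], []
    for _ in range(rounds):
        a = strategy(mine, opp)
        b = opponent(opp, mine)
        if rng.random() < noise:
            a = 1 - a
        mine.append(a)
        opp.append(b)
    return mine, opp

def features(mine, opp):
    # Three behavioral summaries: overall cooperation, retaliation after an
    # opponent defection, and reciprocation after an opponent cooperation.
    mine, opp = np.array(mine), np.array(opp)
    coop_rate = 1 - mine.mean()
    prev_d = opp[:-1] == D
    retaliate = mine[1:][prev_d].mean() if prev_d.any() else 0.0
    prev_c = opp[:-1] == C
    reciprocate = 1 - mine[1:][prev_c].mean() if prev_c.any() else 1.0
    return [coop_rate, retaliate, reciprocate]

# Build a labeled training set by pairing each canonical strategy with a
# random opponent, then fit a simple supervised classifier.
rng = np.random.default_rng(0)
random_opp = lambda opp, mine: int(rng.random() < 0.5)
X, y = [], []
for name, strat in STRATEGIES.items():
    for _ in range(200):
        m, o = simulate(strat, random_opp, rng=rng)
        X.append(features(m, o))
        y.append(name)
clf = LogisticRegression(max_iter=1000).fit(X, y)

# Apply to an (illustrative) observed trajectory standing in for LLM moves.
llm_moves, opp_moves = simulate(STRATEGIES["tit_for_tat"], random_opp, rng=rng)
print(clf.predict([features(llm_moves, opp_moves)]))  # typically ['tit_for_tat']

# Example of the incentive-strength manipulation: same game, scaled stakes.
p = scaled_payoffs(k=5.0)
print(p[(C, D)])  # (0.0, 25.0): sucker/temptation payoffs scaled by k
```

The design choice worth noting is that the classifier labels an *intention* (which canonical strategy best explains the observed moves) rather than a single action, which is what allows incentive-sensitive conditional behavior to be separated from unconditional cooperation or defection.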