Large Language Models (LLMs) often exhibit pronounced context-dependent variability that undermines predictable multi-agent behavior in tasks requiring strategic thinking. Focusing on models in the 7 to 9 billion parameter range engaged in a ten-round repeated Prisoner's Dilemma, we evaluate whether short, costless pre-play messages, in the style of the cheap-talk paradigm, affect strategic stability. Our analysis uses simulation-level bootstrap resampling and nonparametric inference to compare LOWESS-fitted cooperation trajectories between the messaging and no-messaging conditions. We find consistent reductions in trajectory noise across a majority of the model-context pairings studied. The stabilizing effect persists across multiple prompt variants and decoding regimes, though its magnitude depends on model choice and contextual framing, with models displaying higher baseline volatility gaining the most. Communication rarely destabilizes play; we document a few context-specific exceptions and delimit the narrow settings in which it harms stability. These findings position cheap-talk-style communication as a low-cost, practical tool for improving the predictability and reliability of strategic behavior in multi-agent LLM systems.
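To make the analysis pipeline concrete, the sketch below shows one plausible reading of the abstract's method on synthetic data: fit a LOWESS-style smoother to each cooperation trajectory, take the residual standard deviation as "trajectory noise", and bootstrap at the simulation level to interval-estimate the noise difference between conditions. All names, the toy data generator, the smoother parameters, and the noise levels are hypothetical illustrations, not the paper's actual setup; a hand-rolled tricube local-linear smoother stands in for a library LOWESS implementation.

```python
import numpy as np

def lowess_fit(x, y, frac=0.5):
    # Tricube-weighted local linear smoother: a minimal LOWESS-style
    # stand-in for a library implementation (frac = smoothing span).
    n = len(x)
    k = max(2, int(np.ceil(frac * n)))
    yhat = np.empty(n)
    for i in range(n):
        d = np.abs(x - x[i])
        idx = np.argsort(d)[:k]            # k nearest neighbors
        w = (1 - (d[idx] / d[idx].max()) ** 3) ** 3
        # weighted degree-1 fit; polyfit squares its weights internally
        coef = np.polyfit(x[idx], y[idx], 1, w=np.sqrt(w))
        yhat[i] = np.polyval(coef, x[i])
    return yhat

rng = np.random.default_rng(0)
rounds = np.arange(1, 11, dtype=float)     # ten-round repeated game

def trajectory_noise(coop):
    # "Noise" = std of residuals around the LOWESS-fitted trend.
    return (coop - lowess_fit(rounds, coop)).std()

def simulate(n_sims, sigma):
    # Toy cooperation-rate trajectories: mild downward drift plus
    # Gaussian round-to-round variability (sigma is assumed, not measured).
    base = 0.8 - 0.03 * (rounds - 1)
    return np.clip(base + rng.normal(0, sigma, (n_sims, 10)), 0, 1)

no_msg = simulate(200, 0.15)   # assumed higher volatility without messages
msg = simulate(200, 0.05)      # assumed calmer messaging condition

noise_nm = np.array([trajectory_noise(t) for t in no_msg])
noise_m = np.array([trajectory_noise(t) for t in msg])

# Simulation-level bootstrap: resample whole simulations, then compare
# mean trajectory noise between conditions.
diffs = []
for _ in range(1000):
    d = (rng.choice(noise_nm, noise_nm.size).mean()
         - rng.choice(noise_m, noise_m.size).mean())
    diffs.append(d)
lo, hi = np.percentile(diffs, [2.5, 97.5])
print(f"noise reduction 95% CI: [{lo:.3f}, {hi:.3f}]")
```

On this synthetic data the interval excludes zero, mirroring the abstract's claim of a consistent noise reduction; with real model trajectories the same resampling scheme would be applied per model-context pairing.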