Symbolic Reasoning Frameworks Modulate LLM Risk Aversion in Multi-Agent Strategic Settings

Large language models exhibit innate behavioral tendencies when deployed as strategic agents, notably a risk-averse "turtle" bias toward defensive play. We show that symbolic reasoning frameworks, injected as per-round reflective prompts into a single agent, differentially modulate this bias and reshape the multi-agent ecosystem, producing framework-specific winner distributions. In a 7-player Warring States Diplomacy variant (41 games, 4 conditions, single-campaign memory accumulation), each condition produces a distinct ecosystem signature: under control, Yan dominates (7/11, 64%); under I-Ching yarrow divination, Yan and Chu co-dominate while Qin is completely suppressed (0/10); under Tarot, Qin dominates (5/10, Fisher vs. pooled p = 0.006); under scrambled-text ablation (incoherent oracle text that preserves the prompt structure), Qi dominates (5/10, Fisher vs. pooled p = 0.006). The framework-receiving agent (Han) never wins and shows no survival difference across conditions (Fisher p = 1.0), but Tarot consistently elevates Han's peak territory (mean 3.0 supply centers vs. 2.1-2.5 in the other conditions; Kruskal-Wallis p = 0.010). Neither framework's content predicts subsequent actions: hexagram themes (chi-squared p = 0.95) and Tarot card postures (chi-squared p = 0.69) are both independent of action choice, suggesting that the modulation operates through the reflective process rather than through content-following. We present this as an observation paper establishing that alignment-framework choice at the agent level produces distinctive system-level consequences in multi-agent settings.
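The "Fisher vs. pooled" comparisons above can be read as 2x2 tests of one faction's wins in a single condition against its wins in the remaining games pooled together. The following is a minimal sketch of a two-sided Fisher exact test under that reading, using only the Python standard library. The function name and table layout are our own. The abstract does not report the pooled counts, so the 2-of-31 figure in the usage lines is a hypothetical placeholder, not a reported result.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows are groups (e.g. one condition vs. all other games pooled);
    columns are outcomes (e.g. faction wins vs. faction does not win).
    """
    n_total = a + b + c + d
    row1 = a + b          # games in the condition of interest
    col1 = a + c          # total wins by the faction of interest
    k_min = max(0, row1 + col1 - n_total)
    k_max = min(row1, col1)

    # Unnormalized hypergeometric weight of a table whose top-left cell is k.
    # Integer arithmetic keeps the "as or less likely" comparison exact.
    def weight(k: int) -> int:
        return comb(col1, k) * comb(n_total - col1, row1 - k)

    observed = weight(a)
    extreme = sum(weight(k) for k in range(k_min, k_max + 1)
                  if weight(k) <= observed)
    return extreme / comb(n_total, row1)


# Hypothetical illustration only: 5 wins in 10 games in one condition,
# against an ASSUMED 2 wins in the 31 remaining games.
p_value = fisher_exact_two_sided(5, 5, 2, 29)
print(f"two-sided Fisher p = {p_value:.4f}")
```

Summing every table that is no more probable than the observed one is the usual two-sided convention; a one-sided variant would sum only the tables with at least as many wins in the first row.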
