This paper introduces a novel framework for simulating and analyzing how uncooperative behaviors can destabilize or collapse LLM-based multi-agent systems. The framework has two key components: (1) a game-theoretic taxonomy of uncooperative agent behaviors, which addresses a notable gap in the existing literature; and (2) a structured, multi-stage simulation pipeline that dynamically generates and refines uncooperative behaviors as agents' states evolve. We evaluate the framework in a collaborative resource management setting, measuring system stability with metrics such as survival time and resource overuse rate. In human evaluations, the framework achieves 96.7% accuracy in generating realistic uncooperative behaviors. Our results reveal a striking contrast: cooperative agents maintain perfect system stability (100% survival over 12 rounds with 0% resource overuse), whereas any uncooperative behavior can trigger rapid system collapse within 1 to 7 rounds. These findings demonstrate that uncooperative agents can significantly degrade collective outcomes and highlight the need to design more resilient multi-agent systems.