Modeling coordination among generative agents in complex multi-round decision-making is a core challenge for AI and operations management. Although behavioral experiments have revealed the cognitive biases behind supply chain inefficiencies, traditional methods are limited in scalability and experimental control. We introduce a scalable experimental paradigm that uses Large Language Models (LLMs) to simulate multi-stage supply chain dynamics. Building on a Hierarchical Reasoning Framework, we analyze the impact of cognitive heterogeneity on agent interactions. Unlike prior work with homogeneous agents, we employ DeepSeek and GPT agents to systematically vary reasoning sophistication across supply chain tiers. Through replicated and statistically validated simulations, we investigate how this cognitive diversity influences collective outcomes. Results indicate that agents exhibit myopic and self-interested behaviors that exacerbate systemic inefficiencies. However, we demonstrate that information sharing mitigates these adverse effects. Our findings extend traditional behavioral methods and offer new insights into the dynamics of AI-enabled organizations. This work underscores both the potential and the limitations of LLM-based agents as proxies for human decision-making in complex operational environments.