As foundation models are increasingly deployed as interacting agents in multi-agent systems, their collective behavior raises new challenges for trustworthiness, transparency, and accountability. Traditional coordination mechanisms, such as centralized oversight or adversarial adjudication, struggle to scale and often obscure how decisions emerge. We introduce a market-making framework for multi-agent large language model (LLM) coordination that organizes agent interactions as structured economic exchanges. In this setup, each agent acts as a market participant, updating and trading probabilistic beliefs to converge toward shared, truthful outcomes. By aligning local incentives with collective epistemic goals, the framework promotes self-organizing, verifiable reasoning without requiring external enforcement. Empirically, we evaluate this approach across factual reasoning, ethical judgment, and commonsense inference tasks. Market-based coordination yields accuracy gains of up to 10% over single-shot baselines while preserving the interpretability and transparency of intermediate reasoning steps. Beyond these improvements, our findings demonstrate that economic coordination principles can operationalize accountability and robustness in multi-agent LLM systems, offering a scalable pathway toward self-correcting, socially responsible AI that maintains trust and oversight in real-world deployment scenarios.
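To make the "trading probabilistic beliefs" idea concrete, the sketch below shows one standard mechanism such a market could be built on, a logarithmic market scoring rule (LMSR), where each agent repeatedly nudges the market toward its own belief and pays the corresponding cost. This is an illustrative assumption, not the paper's actual protocol; the function names, step size, and convergence loop are hypothetical.

```python
import math

def lmsr_cost(q, b=10.0):
    # LMSR cost function over outstanding share quantities q.
    return b * math.log(sum(math.exp(x / b) for x in q))

def prices(q, b=10.0):
    # Market probabilities implied by the current share quantities.
    z = sum(math.exp(x / b) for x in q)
    return [math.exp(x / b) / z for x in q]

def trade_toward_belief(q, belief, b=10.0, step=1.0):
    # An agent shifts share quantities so prices move toward its own
    # probabilistic belief; the cost it pays is its stake in being right.
    new_q = [qi + step * (bi - pi)
             for qi, bi, pi in zip(q, belief, prices(q, b))]
    cost = lmsr_cost(new_q, b) - lmsr_cost(q, b)
    return new_q, cost

# Hypothetical usage: three agents with differing beliefs over two outcomes.
agents = [[0.9, 0.1], [0.6, 0.4], [0.7, 0.3]]
q = [0.0, 0.0]                       # market starts uninformative (50/50)
for _ in range(50):                  # repeated rounds of trading
    for belief in agents:
        q, _ = trade_toward_belief(q, belief)
print([round(p, 3) for p in prices(q)])  # aggregated market belief
```

Under these assumptions, the final prices act as the shared, auditable consensus belief, and each agent's cumulative trading cost exposes how much it staked on each intermediate claim.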