Large language models (LLMs) have revolutionized medical reasoning tasks, yet single-agent systems often falter on complex, interdisciplinary problems requiring robust handling of uncertainty and conflicting evidence. Multi-agent systems (MAS) leveraging LLMs enable collaborative intelligence, but prevailing centralized architectures suffer from scalability bottlenecks, single points of failure, and role confusion in resource-constrained environments. Decentralized MAS (D-MAS) promise enhanced autonomy and resilience via peer-to-peer interactions, but their application to high-stakes healthcare domains remains underexplored. We introduce MediHive, a novel decentralized multi-agent framework for medical question answering that integrates a shared memory pool with iterative fusion mechanisms. MediHive deploys LLM-based agents that autonomously self-assign specialized roles, conduct initial analyses, detect divergences through conditional evidence-based debates, and locally fuse peer insights over multiple rounds to achieve consensus. Empirically, MediHive outperforms single-LLM and centralized baselines on MedQA and PubMedQA datasets, attaining accuracies of 84.3% and 78.4%, respectively. Our work advances scalable, fault-tolerant D-MAS for medical AI, addressing key limitations of centralized designs while demonstrating superior performance in reasoning-intensive tasks.
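The loop described in the abstract (publish analyses to a shared memory pool, debate only when answers diverge, then fuse peer insights locally over several rounds) can be illustrated with a toy sketch. This is not the authors' implementation. The LLM calls are replaced by fixed answers and confidences, and a confidence-weighted vote stands in for the evidence-based debate and fusion steps. The names `Agent`, `fuse`, `run_rounds`, and `max_rounds` are all hypothetical, as are the role labels in the usage example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Agent:
    role: str          # self-assigned specialty (hypothetical label)
    answer: str        # current answer option, e.g. "A"
    confidence: float  # self-reported confidence in [0, 1]

    def fuse(self, pool):
        """Locally fuse peer entries by adopting the confidence-weighted majority answer.

        Assumption: a weighted vote is a stand-in for LLM-driven fusion.
        """
        weights = defaultdict(float)
        for entry in pool:
            weights[entry["answer"]] += entry["confidence"]
        # Deterministic tie-break: highest total weight, then alphabetical order.
        best = min(weights, key=lambda a: (-weights[a], a))
        if best != self.answer:
            self.answer = best
            self.confidence = weights[best] / sum(weights.values())


def run_rounds(agents, max_rounds=3):
    """Return (consensus answer or None, number of fusion rounds actually run)."""
    rounds_used = 0
    for _ in range(max_rounds):
        # Each agent publishes its current view to the shared memory pool.
        pool = [
            {"role": a.role, "answer": a.answer, "confidence": a.confidence}
            for a in agents
        ]
        # Conditional step: fuse only if the pool shows divergent answers.
        if len({entry["answer"] for entry in pool}) == 1:
            break
        # No coordinator: every agent fuses on its own from the same snapshot.
        for agent in agents:
            agent.fuse(pool)
        rounds_used += 1
    answers = {a.answer for a in agents}
    return (answers.pop() if len(answers) == 1 else None), rounds_used


if __name__ == "__main__":
    team = [
        Agent("cardiology", "A", 0.9),
        Agent("pharmacology", "B", 0.6),
        Agent("radiology", "A", 0.7),
    ]
    print(run_rounds(team))
```

No central coordinator appears in the sketch: each agent reads the same pool snapshot and updates itself, which is the property the abstract contrasts with centralized designs. A real system would replace the weighted vote with model-generated arguments and would need a policy for the case where `max_rounds` is reached without agreement.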