Multi-agent systems (MAS) built on large language models promise improved problem-solving through collaboration, yet they often fail to consistently outperform strong single-agent baselines, largely because errors propagate across inter-agent message handoffs. In this work, we conduct a systematic empirical analysis of such failures and introduce an edge-level error taxonomy that identifies four dominant error types (Data Gap, Signal Corruption, Referential Drift, and Capability Gap) as primary sources of failure in multi-agent interactions. Building on this taxonomy, we propose AgentAsk, a lightweight clarification module that intervenes at the edge level of a MAS to prevent cascading errors. The module strategically applies minimal clarifications at critical points in the system, improving the accuracy and efficiency of the overall task. AgentAsk is trained to balance the trade-offs among clarification cost, latency, and accuracy; it is also architecture-agnostic and integrates easily into existing systems. Evaluated across five benchmarks, AgentAsk consistently improves accuracy by up to 4.69% while keeping extra latency and cost below 10% relative to the baseline MAS, demonstrating high efficiency with minimal overhead.
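The edge-level intervention described above can be pictured as a gate that sits on each agent-to-agent message handoff, scores the message for the four error types, and asks a clarification only when the expected benefit outweighs the fixed cost of asking. The sketch below is purely illustrative and not the paper's implementation: the names (`EdgeMessage`, `score_risk`, `should_clarify`) and the toy heuristics are assumptions; AgentAsk learns this decision from data rather than using hand-written rules.

```python
# Hypothetical sketch of an edge-level clarification gate. In a real system
# the scorer would be a trained model; here simple heuristics stand in for it.
from dataclasses import dataclass

# The four edge-level error types identified in the taxonomy.
ERROR_TYPES = ["data_gap", "signal_corruption", "referential_drift", "capability_gap"]

@dataclass
class EdgeMessage:
    """A message crossing an edge from one agent to another."""
    sender: str
    receiver: str
    content: str

def score_risk(msg: EdgeMessage) -> dict:
    """Toy heuristic risk scorer per error type (a trained model in practice)."""
    scores = {t: 0.0 for t in ERROR_TYPES}
    words = msg.content.lower().split()
    if len(words) < 5:
        scores["data_gap"] = 0.8  # very short handoff: likely missing information
    if "it" in words:
        scores["referential_drift"] = 0.5  # bare pronoun: ambiguous referent
    return scores

def should_clarify(msg: EdgeMessage, threshold: float = 0.6) -> bool:
    """Ask a clarification only when the top risk exceeds a cost-aware threshold,
    trading accuracy gains against the latency/cost of the extra exchange."""
    return max(score_risk(msg).values()) >= threshold
```

Raising `threshold` trades fewer clarifications (lower latency and cost) against a higher chance of letting a corrupted handoff propagate, which is exactly the balance the module is trained to strike.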