While Multi-Agent Systems (MAS) excel at complex tasks, their growing autonomy and operational complexity often lead to critical inefficiencies, such as excessive token consumption and failures arising from misinformation. Existing methods focus primarily on post-hoc failure attribution and lack proactive, real-time interventions that enhance robustness and efficiency. To this end, we introduce SupervisorAgent, a lightweight and modular framework for adaptive runtime supervision that operates without altering the base agent's architecture. Triggered by an LLM-free adaptive filter, SupervisorAgent intervenes at critical junctures to proactively correct errors, steer agents away from inefficient behaviors, and purify observations. On the challenging GAIA benchmark, SupervisorAgent reduces the token consumption of the Smolagent framework by an average of 29.45% without compromising its success rate. Extensive experiments across five additional benchmarks (math reasoning, code generation, and question answering) and various SoTA foundation models validate the broad applicability and robustness of our approach. The code is available at https://github.com/LINs-lab/SupervisorAgent.
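To make the idea of an LLM-free trigger concrete, the following is a minimal illustrative sketch and not the SupervisorAgent implementation. The `Step` type, the `needs_supervision` function, the keyword checks, and the fixed thresholds are all hypothetical; in particular, the fixed thresholds merely stand in for whatever adaptive criterion the actual filter uses. The sketch only shows how cheap rule-based checks could decide which of the three intervention types (error correction, inefficiency guidance, observation purification) to request before any supervisor model is called.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Step:
    """One agent step (hypothetical structure, for illustration only)."""
    action: str        # what the agent did, e.g. a tool call
    observation: str   # what came back from the environment


def needs_supervision(
    history: list[Step],
    step: Step,
    max_obs_chars: int = 2000,  # assumed threshold, not from the paper
    max_repeats: int = 2,       # assumed threshold, not from the paper
) -> Optional[str]:
    """Rule-based, LLM-free check. Returns an intervention type or None."""
    text = step.observation.lower()
    # 1. The observation looks like a failure: ask for error correction.
    if "error" in text or "traceback" in text:
        return "correct_error"
    # 2. The same action was already taken several times: likely a loop.
    repeats = sum(1 for past in history if past.action == step.action)
    if repeats >= max_repeats:
        return "guide_inefficiency"
    # 3. The observation is very long: ask for it to be condensed.
    if len(step.observation) > max_obs_chars:
        return "purify_observation"
    # Otherwise let the agent continue without any supervisor call.
    return None
```

Under these assumptions, a supervisor would be invoked only when the function returns a non-`None` label, so routine steps add no extra model calls.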