Large language model (LLM) agents have made remarkable progress on a variety of complex tasks. Recent work focuses on optimizing the agent team or on using self-reflection to solve complex tasks iteratively. However, because these agents are all built on the same LLM, self-evaluation alone or the removal of underperforming agents does not substantively enhance their capability. We argue that comprehensive evaluation, combined with accumulating experience from the evaluation feedback, is an effective approach to improving system performance. In this paper, we propose Reusable Experience Accumulation with 360° Assessment (360°REA), a hierarchical multi-agent framework inspired by corporate organizational practices. The framework employs a novel 360° performance assessment method for fine-grained, multi-perspective performance evaluation. To enhance the agents' ability to address complex tasks, we introduce a dual-level experience pool in which agents accumulate experience from this fine-grained assessment. Extensive experiments on complex task datasets demonstrate the effectiveness of 360°REA.