Large Language Model (LLM)-based agents have recently shown impressive capabilities in complex reasoning and tool use via multi-step interactions with their environments. While these agents have the potential to tackle complicated tasks, their problem-solving process, i.e., the interaction trajectory an agent follows to complete a task, remains underexploited. These trajectories contain rich feedback that can guide agents toward correct solutions. Although prevailing approaches, such as Monte Carlo Tree Search (MCTS), can effectively balance exploration and exploitation, they ignore the interdependence among trajectories and lack diversity in their search spaces, leading to redundant reasoning and suboptimal outcomes. To address these challenges, we propose SE-Agent, a Self-Evolution framework that enables agents to optimize their reasoning processes iteratively. Our approach revisits and enhances previous pilot trajectories through three key operations: revision, recombination, and refinement. This evolutionary mechanism offers two critical advantages: (1) it expands the search space beyond local optima by intelligently exploring diverse solution paths guided by previous trajectories, and (2) it leverages cross-trajectory inspiration to improve performance efficiently while mitigating the impact of suboptimal reasoning paths. Through these mechanisms, SE-Agent achieves continuous self-evolution that incrementally improves reasoning quality. We evaluate SE-Agent on SWE-bench Verified, where agents resolve real-world GitHub issues. Experimental results across five strong LLMs show that integrating SE-Agent yields up to a 55% relative improvement, achieving state-of-the-art performance among all open-source agents on SWE-bench Verified. Our code and demonstration materials are publicly available at https://github.com/JARVIS-Xs/SE-Agent.
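To make the evolutionary mechanism concrete, the following is a minimal sketch of the revise-recombine-refine loop described above. The operator functions (`revise`, `recombine`, `refine`) and the scorer are hypothetical placeholders introduced only for illustration; they are not the actual SE-Agent API, whose implementation is in the repository linked above.

```python
# A minimal, hypothetical sketch of the self-evolution loop over trajectories.
# All callables here are illustrative placeholders, not the SE-Agent codebase.

from typing import Callable, List

Trajectory = List[str]  # one multi-step agent interaction trace


def self_evolve(
    pilots: List[Trajectory],
    revise: Callable[[Trajectory], Trajectory],
    recombine: Callable[[List[Trajectory]], Trajectory],
    refine: Callable[[Trajectory], Trajectory],
    score: Callable[[Trajectory], float],
    n_iters: int = 3,
) -> Trajectory:
    """Iteratively evolve a pool of pilot trajectories and return the best."""
    pool = list(pilots)
    for _ in range(n_iters):
        # Revision: rework each trajectory individually using its own feedback.
        revised = [revise(t) for t in pool]
        # Recombination: merge complementary strengths across trajectories,
        # letting one path benefit from another's partial progress.
        merged = recombine(revised)
        # Refinement: polish the recombined trajectory into a new candidate.
        pool = revised + [refine(merged)]
        # Keep only the strongest candidates so the pool stays bounded.
        pool = sorted(pool, key=score, reverse=True)[: len(pilots)]
    return max(pool, key=score)
```

Keeping the revised trajectories alongside the recombined one preserves a diverse candidate pool, which reflects the diversity property the abstract contrasts against MCTS-style search.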