Conventional agent systems often struggle in open-ended environments where task distributions continuously drift and external supervision is scarce. Their reliance on static toolsets or offline training lags behind these dynamics, leaving the system's capability boundaries rigid and unknown. To address this, we propose the In-Situ Self-Evolving paradigm, which treats sequential task interactions as a continuous stream of experience, enabling the system to distill short-term execution feedback into long-term, reusable capabilities without access to ground-truth labels. Within this framework, we identify tool evolution as the critical pathway for capability expansion, as it provides verifiable, binary feedback signals. Building on this insight, we develop Yunjue Agent, a system that iteratively synthesizes, optimizes, and reuses tools to navigate emerging challenges. To improve evolutionary efficiency, we further introduce a Parallel Batch Evolution strategy. Empirical evaluations across five diverse benchmarks under a zero-start setting demonstrate significant performance gains over proprietary baselines, and complementary warm-start evaluations confirm that the accumulated general knowledge transfers seamlessly to novel domains. Finally, we propose a novel metric for monitoring evolution convergence, serving a function analogous to training loss in conventional optimization. We open-source our codebase, system traces, and evolved tools to facilitate future research on resilient, self-evolving intelligence.