EvoScientist: Towards Multi-Agent Evolving AI Scientists for End-to-End Scientific Discovery

The increasing adoption of Large Language Models (LLMs) has enabled AI scientists to perform complex end-to-end scientific discovery tasks requiring coordination of specialized roles, including idea generation and experimental execution. However, most state-of-the-art AI scientist systems rely on static, hand-designed pipelines and fail to adapt based on accumulated interaction histories. As a result, these systems overlook promising research directions, repeat failed experiments, and pursue infeasible ideas. To address this, we introduce EvoScientist, an evolving multi-agent AI scientist framework that continuously improves research strategies through persistent memory and self-evolution. EvoScientist comprises three specialized agents: a Researcher Agent (RA) for scientific idea generation, an Engineer Agent (EA) for experiment implementation and execution, and an Evolution Manager Agent (EMA) that distills insights from prior interactions into reusable knowledge. EvoScientist contains two persistent memory modules: (i) an ideation memory, which summarizes feasible research directions from top-ranked ideas while recording previously unsuccessful directions; and (ii) an experimentation memory, which captures effective data processing and model training strategies derived from code search trajectories and best-performing implementations. These modules enable the RA and EA to retrieve relevant prior strategies, improving idea quality and code execution success rates over time. Experiments show that EvoScientist outperforms 7 open-source and commercial state-of-the-art systems in scientific idea generation, achieving higher novelty, feasibility, relevance, and clarity via automatic and human evaluation. EvoScientist also substantially improves code execution success rates through multi-agent evolution, demonstrating persistent memory's effectiveness for end-to-end scientific discovery.
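To make the role of the two persistent memory modules concrete, here is a minimal sketch of how an ideation memory and an experimentation memory might be written to and queried. It is not the authors' implementation: the abstract does not specify storage or retrieval, so every name here (`MemoryEntry`, `Memory`, `distill`, the word-overlap scoring) is a hypothetical stand-in chosen for illustration. The `distill` function only mimics the EMA's job of turning past outcomes into reusable entries.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class MemoryEntry:
    text: str        # distilled insight, e.g. a research direction or a training tip
    succeeded: bool  # False marks a previously unsuccessful direction


@dataclass
class Memory:
    """Hypothetical persistent memory; a real system would store this on disk."""
    entries: list[MemoryEntry] = field(default_factory=list)

    def add(self, text: str, succeeded: bool) -> None:
        self.entries.append(MemoryEntry(text, succeeded))

    def retrieve(self, query: str, k: int = 2) -> list[MemoryEntry]:
        # Toy relevance score (an assumption): count of shared lowercase words.
        q = set(query.lower().split())
        scored = [(len(q & set(e.text.lower().split())), e) for e in self.entries]
        relevant = [s for s in scored if s[0] > 0]
        relevant.sort(key=lambda s: s[0], reverse=True)  # stable sort keeps insertion order on ties
        return [e for _, e in relevant[:k]]


def distill(ideation: Memory, experimentation: Memory,
            ranked_ideas: list[str], failed_ideas: list[str],
            experiment_notes: list[str], top_n: int = 2) -> None:
    """Stand-in for the EMA: record top-ranked and failed ideas, plus experiment tips."""
    for idea in ranked_ideas[:top_n]:
        ideation.add(idea, succeeded=True)
    for idea in failed_ideas:
        ideation.add(idea, succeeded=False)
    for note in experiment_notes:
        experimentation.add(note, succeeded=True)


ideation, experimentation = Memory(), Memory()
distill(
    ideation, experimentation,
    ranked_ideas=["contrastive pretraining for graph retrieval",
                  "sparse attention for long documents"],
    failed_ideas=["graph retrieval with full pairwise attention"],
    experiment_notes=["normalize features before graph training"],
)

# The RA would consult ideation memory before proposing a new idea,
# and the EA would consult experimentation memory before writing code.
idea_hits = ideation.retrieve("graph retrieval idea")
code_hits = experimentation.retrieve("graph training")
```

In this sketch the RA sees both a feasible direction and a failed one for the same topic, which is the behaviour the abstract attributes to the ideation memory; a practical system would likely replace the word-overlap score with embedding-based retrieval.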
