While large language models (LLMs) have turned AI agents into proficient executors of computational materials science workflows, performing a hundred simulations does not make a researcher. What distinguishes research from routine execution is the progressive accumulation of knowledge: learning which approaches fail, recognizing patterns across systems, and applying understanding to new problems. The prevailing paradigm in AI-driven computational science, however, treats each execution in isolation, largely discarding hard-won insights between runs. Here we present QMatSuite, an open-source platform that closes this gap. Agents record findings with full provenance, retrieve relevant knowledge before new calculations, and, in dedicated reflection sessions, correct erroneous findings and synthesize observations into cross-compound patterns. In benchmarks on a six-step quantum-mechanical simulation workflow, accumulated knowledge reduces reasoning overhead by 67% and cuts the deviation from literature values from 47% to 3%; when transferred to an unfamiliar material, the platform achieves 1% deviation with zero pipeline failures.