Large language models (LLMs) have emerged as powerful tools in chemistry, significantly impacting molecule design, property prediction, and synthesis optimization. This review highlights LLM capabilities in these domains and their potential to accelerate scientific discovery through automation. We also review LLM-based autonomous agents: LLMs equipped with a broader set of tools for interacting with their surrounding environment. These agents perform diverse tasks such as paper scraping, interfacing with automated laboratories, and synthesis planning. Because agents are an emerging topic, we extend our review of agents beyond chemistry and discuss them across scientific domains. This review covers the recent history, current capabilities, and design of LLMs and autonomous agents, and addresses specific challenges, opportunities, and future directions in chemistry. Key challenges include data quality and integration, model interpretability, and the need for standard benchmarks, while future directions point towards more sophisticated multi-modal agents and enhanced collaboration between agents and experimental methods. Because the field is moving quickly, we have built a repository to keep track of the latest studies: https://github.com/ur-whitelab/LLMs-in-science.