Over the last decades, excellent computational chemistry tools have been developed. Their full potential has not yet been reached, as most are challenging to learn and exist in isolation. Recently, large language models (LLMs) have shown strong performance in tasks across domains, but they struggle with chemistry-related problems. Moreover, these models lack access to external knowledge sources, limiting their usefulness in scientific applications. In this study, we introduce ChemCrow, an LLM chemistry agent designed to accomplish tasks across organic synthesis, drug discovery, and materials design. By integrating 18 expert-designed tools, ChemCrow augments LLM performance in chemistry, and new capabilities emerge. Our agent autonomously planned and executed the syntheses of an insect repellent and three organocatalysts, and guided the discovery of a novel chromophore. Our evaluation, including both LLM and expert assessments, demonstrates ChemCrow's effectiveness in automating a diverse set of chemical tasks. Surprisingly, we find that GPT-4 as an evaluator cannot distinguish between clearly wrong GPT-4 completions and ChemCrow's performance. There is a significant risk of misuse of tools like ChemCrow, and we discuss their potential harms. Employed responsibly, our work not only aids expert chemists and lowers barriers for non-experts, but also fosters scientific advancement by bridging the gap between experimental and computational chemistry. Publicly available code can be found at https://github.com/ur-whitelab/chemcrow-public