ChemCrow: Augmenting large-language models with chemistry tools

Over the past decades, excellent computational chemistry tools have been developed. Integrating them into a single, more accessible platform could help them reach their full potential by overcoming steep learning curves. Recently, large language models (LLMs) have shown strong performance on tasks across domains, but they struggle with chemistry-related problems. Moreover, these models lack access to external knowledge sources, which limits their usefulness in scientific applications. In this study, we introduce ChemCrow, an LLM chemistry agent designed to accomplish tasks across organic synthesis, drug discovery, and materials design. By integrating 18 expert-designed tools, ChemCrow augments LLM performance in chemistry, and new capabilities emerge. Our agent autonomously planned and executed the syntheses of an insect repellent and three organocatalysts, and guided the discovery of a novel chromophore. Our evaluation, including both LLM and expert assessments, demonstrates ChemCrow's effectiveness in automating a diverse set of chemical tasks. Surprisingly, we find that GPT-4 as an evaluator cannot distinguish between clearly wrong GPT-4 completions and ChemCrow's performance. Our work not only aids expert chemists and lowers barriers for non-experts, but also fosters scientific advancement by bridging the gap between experimental and computational chemistry.

