ChemCrow: Augmenting large-language models with chemistry tools

Large-language models (LLMs) have recently shown strong performance in tasks across domains, but struggle with chemistry-related problems. Moreover, these models lack access to external knowledge sources, limiting their usefulness in scientific applications. In this study, we introduce ChemCrow, an LLM chemistry agent designed to accomplish tasks across organic synthesis, drug discovery, and materials design. By integrating 13 expert-designed tools, ChemCrow augments the LLM performance in chemistry, and new capabilities emerge. Our evaluation, including both LLM and expert human assessments, demonstrates ChemCrow's effectiveness in automating a diverse set of chemical tasks. Surprisingly, we find that GPT-4 as an evaluator cannot distinguish between clearly wrong GPT-4 completions and GPT-4 + ChemCrow performance. There is a significant risk of misuse of tools like ChemCrow and we discuss their potential harms. Employed responsibly, ChemCrow not only aids expert chemists and lowers barriers for non-experts, but also fosters scientific advancement by bridging the gap between experimental and computational chemistry.


Related content

GPT-4


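In the early hours of March 15, 2023 (Beijing time), OpenAI, the developer of ChatGPT, released GPT-4, a new multimodal pretrained large model. It is more reliable, more creative, and able to follow more nuanced instructions, and it can generate content from both image and text prompts. Compared with the previous generation of models, GPT-4 is a major step forward: it accepts image and text input and has strong image-recognition ability; its text input limit is much higher, so that in ChatGPT mode it can process more than 25,000 words of text and handle more detailed instructions; and the accuracy of its answers has improved markedly.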
