Benchmarking Large Language Models for Molecule Prediction Tasks

Large Language Models (LLMs) stand at the forefront of many Natural Language Processing (NLP) tasks. Despite their widespread adoption in NLP, their potential in broader fields remains largely unexplored, and significant limitations persist in their design and implementation. Notably, LLMs struggle with structured data, such as graphs, and often falter on domain-specific questions that require deep expertise, such as those in biology and chemistry. In this paper, we explore a fundamental question: can LLMs effectively handle molecule prediction tasks? Rather than pursuing top-tier performance, our goal is to assess how LLMs can contribute to diverse molecule tasks. We identify several classification and regression prediction tasks across six standard molecule datasets. We then design a set of prompts to query LLMs on these tasks and compare their performance with existing Machine Learning (ML) models, which include text-based models and models specifically designed for analysing the geometric structure of molecules. Our investigation yields several key insights. First, LLMs generally lag behind ML models on molecule tasks, particularly models adept at capturing the geometric structure of molecules, which highlights the limited ability of LLMs to comprehend graph data. Second, LLMs show promise in enhancing the performance of ML models when the two are used collaboratively. Finally, we discuss the challenges and promising avenues for harnessing LLMs for molecule prediction tasks. The code and models are available at https://github.com/zhiqiangzhongddu/LLMaMol.
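To make the prompting setup concrete, the following is a minimal sketch of how a classification query for a single molecule could be assembled and how the reply could be mapped to a label. It is not the paper's prompt design: the function names `build_prompt` and `parse_answer`, the prompt wording, and the use of a SMILES string as the molecule's text representation are all assumptions made for illustration.

```python
import re


def build_prompt(smiles, task, examples=()):
    """Assemble a zero-shot (no examples) or few-shot classification prompt.

    `examples` is an iterable of (smiles, label) pairs; the wording below is
    hypothetical and not taken from the paper.
    """
    lines = [
        "You are an expert chemist.",
        f"Task: {task}",
        "Answer with a single word: Yes or No.",
    ]
    # Few-shot demonstrations, if any, come before the query molecule.
    for ex_smiles, ex_label in examples:
        answer = "Yes" if ex_label else "No"
        lines.append(f"SMILES: {ex_smiles}\nAnswer: {answer}")
    lines.append(f"SMILES: {smiles}\nAnswer:")
    return "\n".join(lines)


def parse_answer(reply):
    """Map a free-text reply to 1 (Yes), 0 (No), or None if it is unclear."""
    match = re.match(r"[a-z]+", reply.strip().lower())
    first_word = match.group(0) if match else ""
    if first_word == "yes":
        return 1
    if first_word == "no":
        return 0
    return None


if __name__ == "__main__":
    # Placeholder task description; a real task would name a dataset property.
    print(build_prompt("CCO", "Does the molecule have property P?"))
```

Returning `None` for unparseable replies keeps invalid answers separate from negative predictions, so they can be counted or retried instead of silently scored as a class.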

