Representing materials as text could bring the rapid advances in large language models (LLMs) to bear on the discovery of new materials. Although LLMs have shown remarkable success in many domains, their application to materials science remains underexplored. A fundamental challenge is that it is not yet understood how best to use text-based representations for materials modeling. This challenge is compounded by the absence of a comprehensive benchmark for rigorously evaluating the capabilities and limitations of these representations in capturing the complexity of material systems. To address this gap, we propose MatText, a suite of benchmarking tools and datasets designed to systematically evaluate the performance of language models in modeling materials. MatText encompasses nine distinct text-based representations of material systems, including several novel ones. Each representation carries its own inductive biases, which capture relevant information and integrate prior physical knowledge about materials. MatText also provides tools for training and benchmarking language models in the context of materials science: standardized dataset splits for each representation, probes for evaluating sensitivity to geometric factors, and tools for converting crystal structures into text. Using MatText, we conduct an extensive analysis of the capabilities of language models in modeling materials. We find that current language models consistently struggle to capture the geometric information crucial for materials modeling, across all representations. Instead, these models tend to rely on local information, which some of our novel representations emphasize. Our analysis underscores MatText's ability to reveal shortcomings of text-based methods for materials design.
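To make the idea of a text representation concrete, the following is a minimal sketch of two ways a crystal structure could be serialized as a string: one that keeps only the composition and one that also writes out cell parameters and fractional coordinates. The function names `composition_text` and `geometry_text`, the output formats, and the toy cell values are assumptions made for illustration; they are not MatText's API, nor any of its nine representations.

```python
from collections import Counter


def composition_text(species):
    """Geometry-free string: element counts only, sorted by element symbol."""
    counts = Counter(species)
    return "".join(f"{el}{n}" for el, n in sorted(counts.items()))


def geometry_text(cell, species, frac_coords, precision=2):
    """Geometry-aware string: cell parameters, then one line per atom.

    cell is (a, b, c, alpha, beta, gamma); frac_coords holds one
    (x, y, z) tuple of fractional coordinates per entry in species.
    """
    lines = [" ".join(f"{x:.{precision}f}" for x in cell)]
    for el, xyz in zip(species, frac_coords):
        coords = " ".join(f"{c:.{precision}f}" for c in xyz)
        lines.append(f"{el} {coords}")
    return "\n".join(lines)


# Hypothetical two-atom cell; the numbers are illustrative, not measured data.
cell = (4.0, 4.0, 4.0, 90.0, 90.0, 90.0)
stretched_cell = (4.4, 4.4, 4.4, 90.0, 90.0, 90.0)
species = ["Na", "Cl"]
frac_coords = [(0.0, 0.0, 0.0), (0.5, 0.5, 0.5)]

print(composition_text(species))
print(geometry_text(cell, species, frac_coords))
```

In this sketch, stretching the cell changes the output of `geometry_text` but leaves `composition_text` unchanged. That contrast is the kind of difference a probe for geometric sensitivity would need a model to pick up on.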