Siamese Foundation Models for Crystal Structure Prediction

Crystal Structure Prediction (CSP), which aims to generate stable crystal structures from compositions, is a critical pathway for discovering novel materials. While structure prediction in other domains, such as proteins, has seen remarkable progress, CSP remains relatively underexplored because of the more complex geometries inherent in crystal structures. In this paper, we propose Siamese foundation models specifically designed to address CSP. Our pretrain-finetune framework, named DAO, comprises two complementary foundation models: DAO-G for structure generation and DAO-P for energy prediction. Experiments on the CSP benchmarks MP-20 and MPTS-52 demonstrate that DAO-G significantly surpasses state-of-the-art (SOTA) methods across all metrics. Extensive ablation studies further confirm that DAO-G excels at generating diverse polymorphic structures, and that the dataset relaxation and energy guidance provided by DAO-P are essential for enhancing DAO-G's performance. When applied to three real-world superconductors ($\text{CsV}_3\text{Sb}_5$, $\text{Zr}_{16}\text{Rh}_8\text{O}_4$ and $\text{Zr}_{16}\text{Pd}_8\text{O}_4$) that are known to be challenging to analyze, our foundation models achieve accurate critical temperature predictions and structure generations. For instance, on $\text{CsV}_3\text{Sb}_5$, DAO-G generates a structure close to the experimental one, with an RMSE of 0.0085, and DAO-P predicts the $T_c$ value with high accuracy (2.26 K vs. the ground-truth value of 2.30 K). In contrast, conventional DFT calculators such as Quantum ESPRESSO derive the structure of only the first superconductor within an acceptable time; even then, the RMSE is nearly 8 times larger and the computation is more than 1000 times slower. Together, these results highlight the potential of our approach for advancing materials science research and development.
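
The abstract reports structural agreement as an RMSE but does not say how that number is computed. As a rough illustration only, the sketch below assumes the atoms of the generated and reference structures are already matched one-to-one, that both share the same lattice, and that the root-mean-square displacement is normalized by $(V/N)^{1/3}$. The function name `normalized_rmse` and the toy cell are hypothetical and are not taken from the paper. Wrapping each fractional difference independently is only an approximation to the true nearest periodic image for strongly skewed cells.

```python
import math


def normalized_rmse(frac_a, frac_b, lattice):
    """RMSE between matched fractional coordinates, scaled by (V/N)^(1/3).

    Hypothetical helper for illustration; assumes a one-to-one atom matching
    and a shared lattice whose rows are the three lattice vectors.
    """
    n = len(frac_a)
    squared_sum = 0.0
    for pa, pb in zip(frac_a, frac_b):
        # Wrap each fractional difference into [-0.5, 0.5] so that atoms
        # are compared across periodic boundaries.
        delta = [x - y for x, y in zip(pa, pb)]
        delta = [d - round(d) for d in delta]
        # Convert the fractional displacement to Cartesian coordinates.
        cart = [sum(delta[i] * lattice[i][k] for i in range(3)) for k in range(3)]
        squared_sum += sum(c * c for c in cart)
    rmsd = math.sqrt(squared_sum / n)

    # Cell volume from the scalar triple product a . (b x c).
    a, b, c = lattice
    cross = [
        b[1] * c[2] - b[2] * c[1],
        b[2] * c[0] - b[0] * c[2],
        b[0] * c[1] - b[1] * c[0],
    ]
    volume = abs(sum(a[i] * cross[i] for i in range(3)))
    return rmsd / (volume / n) ** (1.0 / 3.0)


# Toy example: a 4 Angstrom cubic cell with two atoms, one of which sits
# just across the periodic boundary in the "generated" structure.
cubic = [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 4.0]]
reference = [[0.0, 0.0, 0.0], [0.5, 0.5, 0.5]]
generated = [[0.99, 0.0, 0.0], [0.5, 0.5, 0.5]]
score = normalized_rmse(reference, generated, cubic)
```

In the toy example the first atom is displaced by 0.04 Å across the cell boundary rather than by 3.96 Å, which is why the wrap step matters; without it, a near-perfect structure would be scored as a poor one.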
