Accurate and scalable exchange-correlation with deep learning

Giulia Luise, Chin-Wei Huang, Thijs Vogels, Derk P. Kooi, Sebastian Ehlert, Stephanie Lanius, Klaas J. H. Giesbertz, Amir Karton, Deniz Gunceler, Megan Stanley, Wessel P. Bruinsma, Lin Huang, Xinran Wei, José Garrido Torres, Abylay Katbashev, Rodrigo Chavez Zavaleta, Bálint Máté, Sékou-Oumar Kaba, Roberto Sordillo, Yingrong Chen, David B. Williams-Young, Christopher M. Bishop, Jan Hermann, Rianne van den Berg, Paola Gori-Giorgi

From arXiv. Main text: 13 pages plus references, 11 figures and tables. Supplementary information: 19 pages, 12 figures and tables. v2 update: fixed rendering of figure 1 and part of figure 5 in the Safari PDF viewer. v3 update: updated author information and fixed a typo. The Skala model and inference code are available under the MIT license at https://github.com/microsoft/skala

Density Functional Theory (DFT) is the most widely used electronic structure method for predicting the properties of molecules and materials. Although DFT is, in principle, an exact reformulation of the Schrödinger equation, practical applications rely on approximations to the unknown exchange-correlation (XC) functional. Most existing XC functionals are constructed using a limited set of increasingly complex, hand-crafted features that improve accuracy at the expense of computational efficiency. Yet, no current approximation achieves the accuracy and generality for predictive modeling of laboratory experiments at chemical accuracy -- typically defined as errors below 1 kcal/mol. In this work, we present Skala, a modern deep learning-based XC functional that bypasses expensive hand-designed features by learning representations directly from data. Skala achieves chemical accuracy for atomization energies of small molecules while retaining the computational efficiency typical of semi-local DFT. This performance is enabled by training on an unprecedented volume of high-accuracy reference data generated using computationally intensive wavefunction-based methods. Notably, Skala systematically improves with additional training data covering diverse chemistry. By incorporating a modest amount of additional high-accuracy data tailored to chemistry beyond atomization energies, Skala achieves accuracy competitive with the best-performing hybrid functionals across general main group chemistry, at the cost of semi-local DFT. As the training dataset continues to expand, Skala is poised to further enhance the predictive power of first-principles simulations.
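
To make the "chemical accuracy" criterion in the abstract concrete, here is a minimal Python sketch of how an atomization energy is formed from total energies and how an error is compared against the 1 kcal/mol threshold. It is not taken from the Skala code base: the function names are hypothetical, the atomization energy uses the standard definition (sum of isolated-atom energies minus the molecular energy), and the conversion factor of about 627.5095 kcal/mol per Hartree is the usual rounded value. The energies in the usage lines are made-up placeholders, not results from the paper.

```python
# Approximate conversion factor from Hartree to kcal/mol (rounded).
HARTREE_TO_KCAL_PER_MOL = 627.5095

# Threshold named in the abstract: errors below 1 kcal/mol.
CHEMICAL_ACCURACY_KCAL_PER_MOL = 1.0


def atomization_energy(molecule_energy, atom_energies):
    """Sum of isolated-atom total energies minus the molecular total energy.

    All inputs share one unit (for example Hartree); the result is in that unit.
    """
    return sum(atom_energies) - molecule_energy


def error_kcal_per_mol(predicted_hartree, reference_hartree):
    """Absolute error between two energies given in Hartree, in kcal/mol."""
    return abs(predicted_hartree - reference_hartree) * HARTREE_TO_KCAL_PER_MOL


def within_chemical_accuracy(predicted_hartree, reference_hartree,
                             threshold=CHEMICAL_ACCURACY_KCAL_PER_MOL):
    """True if the absolute error is below the threshold (kcal/mol)."""
    return error_kcal_per_mol(predicted_hartree, reference_hartree) < threshold


# Placeholder total energies in Hartree for a three-atom molecule (illustrative only).
predicted = atomization_energy(-76.40, [-0.50, -0.50, -75.03])
reference = 0.371  # hypothetical high-accuracy reference atomization energy

print(f"error = {error_kcal_per_mol(predicted, reference):.2f} kcal/mol")
print("chemically accurate:", within_chemical_accuracy(predicted, reference))
```

Note that a 1 kcal/mol threshold corresponds to roughly 0.0016 Hartree, which is tiny compared with the total energies involved. This is why atomization energies are a demanding test of an XC functional.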
