Machine-learned interatomic potentials have been successfully applied to atomistic simulations. However, deep learning pipelines are notoriously data-hungry, and generating the reference calculations they need is computationally demanding. To overcome this difficulty, we propose a transfer learning algorithm that combines the ability of graph neural networks (GNNs) to describe chemical environments with kernel mean embeddings. We extract a feature map from GNNs pre-trained on the OC20 dataset and use it to learn the potential energy surface from system-specific datasets of catalytic processes. A flexible kernel function that incorporates chemical species information further enhances the method, improving both performance and interpretability. We test our approach on a series of realistic datasets of increasing complexity, showing excellent generalization and transferability, and improving on methods that rely on GNNs or ridge regression alone, as well as on similar fine-tuning approaches. We make the code available to the community at https://github.com/IsakFalk/atomistic_transfer_mekrr.
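The core idea of pairing per-atom features with kernel mean embeddings can be illustrated with a minimal sketch. It is not the authors' implementation (see the repository linked above for that). It assumes each configuration is given as an array of per-atom feature vectors, which here are random stand-ins for features that would come from a pre-trained GNN. It also assumes a Gaussian atom-level kernel and omits the chemical species information. All function names, `sigma`, and `lam` are hypothetical choices made for illustration.

```python
import numpy as np


def gaussian_kernel(X, Y, sigma=1.0):
    """Pairwise Gaussian kernel between the rows of X (n, d) and Y (m, d)."""
    sq_dists = (
        np.sum(X**2, axis=1)[:, None]
        + np.sum(Y**2, axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * sigma**2))


def mean_embedding_kernel(A, B, sigma=1.0):
    """Kernel between two configurations: the atom-level kernel averaged
    over all atom pairs, i.e. the inner product of their mean embeddings."""
    return gaussian_kernel(A, B, sigma).mean()


def gram_matrix(configs_a, configs_b, sigma=1.0):
    """Gram matrix between two lists of configurations (atom counts may differ)."""
    return np.array(
        [[mean_embedding_kernel(A, B, sigma) for B in configs_b] for A in configs_a]
    )


def fit_krr(configs, energies, sigma=1.0, lam=1e-8):
    """Kernel ridge regression: solve (K + lam * I) alpha = y."""
    K = gram_matrix(configs, configs, sigma)
    return np.linalg.solve(K + lam * np.eye(len(configs)), energies)


def predict(train_configs, alpha, test_configs, sigma=1.0):
    """Predict energies of test configurations from the fitted coefficients."""
    return gram_matrix(test_configs, train_configs, sigma) @ alpha


# Toy usage. Random features stand in for GNN outputs (assumption).
rng = np.random.default_rng(0)
train = [rng.normal(size=(int(rng.integers(3, 7)), 8)) for _ in range(20)]
energies = rng.normal(size=20)  # synthetic reference energies, not real data
alpha = fit_krr(train, energies)
train_pred = predict(train, alpha, train)
```

Because the configuration-level kernel averages over atoms, configurations with different numbers of atoms can be compared directly, and the only training step is a single linear solve rather than gradient-based fine-tuning of the network.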