Generalizability of Functional Forms for Interatomic Potential Models Discovered by Symbolic Regression

In recent years there has been great progress in the use of machine learning algorithms to develop interatomic potential models. Machine-learned potential models are typically orders of magnitude faster than density functional theory but also orders of magnitude slower than physics-derived models such as the embedded atom method. In our previous work, we used symbolic regression to develop fast, accurate, and transferable interatomic potential models for copper with novel functional forms that resemble those of the embedded atom method. To determine the extent to which the success of these forms was specific to copper, here we explore how well they generalize to other face-centered cubic transition metals and analyze their out-of-sample performance on several material properties. We find that these forms work particularly well for elements that are chemically similar to copper. Compared with optimized Sutton-Chen models, which have similar complexity, the functional forms discovered using symbolic regression perform better for all elements considered except gold, for which the two perform similarly. They perform similarly to a moderately more complex embedded atom form on the properties on which they were trained, and they are more accurate on average on other properties. We attribute this improved generalized accuracy to the relative simplicity of the models discovered using symbolic regression. The genetic programming models outperform other models from the literature about 50% of the time across a variety of property predictions, with about one-tenth the model complexity on average. We discuss the implications of these results for the broader application of symbolic regression to the development of new potentials and highlight how models discovered for one element can be used to seed new searches for different elements.
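
For readers unfamiliar with the Sutton-Chen baseline mentioned above, its standard functional form is E = ε [ ½ Σ_i Σ_{j≠i} (a/r_ij)^n − c Σ_i √ρ_i ], with ρ_i = Σ_{j≠i} (a/r_ij)^m. The snippet below is only a minimal sketch of that textbook form for a small finite cluster, assuming no cutoff radius and no periodic images; it is not the fitting code or the optimized parameterization used in this work. The function name and the parameter values in the usage lines are placeholders chosen for illustration.

```python
import numpy as np


def sutton_chen_energy(positions, epsilon, a, c, n, m):
    """Total Sutton-Chen energy of a finite cluster (illustrative sketch).

    Assumptions: no cutoff radius, no periodic boundary conditions.
    positions: (N, 3) array-like of Cartesian coordinates.
    epsilon, a, c, n, m: Sutton-Chen parameters (placeholders here).
    """
    positions = np.asarray(positions, dtype=float)

    # Pairwise distance matrix r[i, j] = |r_i - r_j|.
    diff = positions[:, None, :] - positions[None, :, :]
    r = np.linalg.norm(diff, axis=-1)

    # Exclude self-interaction: (a / inf) ** k evaluates to 0 for k > 0.
    np.fill_diagonal(r, np.inf)

    # Repulsive pair term; the factor 1/2 corrects for double counting.
    pair = 0.5 * np.sum((a / r) ** n)

    # Density-like sum per atom, then the square-root embedding term.
    rho = np.sum((a / r) ** m, axis=1)
    embed = -c * np.sum(np.sqrt(rho))

    return epsilon * (pair + embed)


# Usage with placeholder (not fitted) parameters: a dimer separated by a.
dimer = [[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]]
energy = sutton_chen_energy(dimer, epsilon=1.0, a=2.0, c=1.0, n=9, m=6)
```

For a dimer at separation r = a, every ratio a/r equals 1, so the expression reduces to ε(1 − 2c), which is a convenient sanity check. The form has only a handful of parameters, which is why it serves as a baseline of similar complexity to the symbolic-regression models.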