Machine learning models are now firmly established across scientific fields. Extracting features from data and making inferences from them with neural network models often yields high accuracy; however, this approach has several drawbacks. Symbolic regression is a powerful technique for discovering analytical equations that describe data, providing interpretable and generalizable models capable of predicting unseen data. Symbolic regression methods have gained new momentum with the advancement of neural network technologies and offer several advantages, chief among them the interpretability of results. In this work, we examine the application of the deep symbolic regression algorithm SEGVAE to determining the properties of two-dimensional materials with defects. Compared with state-of-the-art graph neural network-based methods, SEGVAE yields comparable or, in some cases, even identical results. We also discuss the applicability of this class of methods in the natural sciences.