Machine learning for material property prediction and discovery has traditionally centered on graph neural networks that incorporate the geometric configuration of all atoms. In practice, however, not all of this information may be readily available, e.g. when evaluating the potentially unknown binding of adsorbates to a catalyst. In this paper, we investigate whether a system's relaxed energy in the OC20 dataset can be predicted while ignoring the position of the adsorbate relative to the electrocatalyst. We consider SchNet, DimeNet++ and FAENet as base architectures and measure the impact of four modifications on model performance: removing edges in the input graph, pooling independent representations, not sharing the backbone weights, and using an attention mechanism to propagate non-geometric relative information. We find that, while removing binding-site information impairs accuracy as expected, the modified models can still predict relaxed energies with a remarkably good mean absolute error (MAE). Our work suggests future research directions in accelerated materials discovery in which information on reactant configurations can be reduced or omitted altogether.
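The first modification, removing edges in the input graph, can be illustrated with a minimal sketch. It assumes that "removing edges" means dropping every edge that connects an adsorbate atom to a catalyst atom, so that the two sub-systems become disconnected graphs; the abstract does not spell this out. The function name `drop_cross_edges`, the edge-list representation, and the toy graph are hypothetical and chosen only for illustration, not taken from the paper's code.

```python
def drop_cross_edges(edges, is_adsorbate):
    """Keep only edges whose two endpoints belong to the same sub-system.

    edges: list of (source, target) atom-index pairs (directed).
    is_adsorbate: list of booleans, one per atom (assumed labeling).
    """
    return [(i, j) for i, j in edges if is_adsorbate[i] == is_adsorbate[j]]


# Toy system (hypothetical): atoms 0-2 are catalyst, atoms 3-4 are adsorbate.
is_adsorbate = [False, False, False, True, True]
edges = [(0, 1), (1, 2), (2, 3), (3, 2), (3, 4), (4, 3)]

# The catalyst-adsorbate edges (2, 3) and (3, 2) are removed.
pruned = drop_cross_edges(edges, is_adsorbate)
```

After pruning, no message can pass between adsorbate and catalyst, so a model operating on this graph cannot use their relative position; the remaining modifications listed above concern how the two resulting representations are then combined.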