In recent years, molecular representation learning has emerged as a key area of focus in various chemical tasks. However, many existing models do not fully account for the geometric information of molecular structures, which makes the learned representations less intuitive. Moreover, the widely used message-passing mechanism offers limited means of interpreting experimental results from a chemical perspective. To address these challenges, we introduce a novel Transformer-based framework for molecular representation learning, named the Geometry-aware Transformer (GeoT). GeoT learns molecular graph structures through attention-based mechanisms specifically designed to support both reliable interpretability and molecular property prediction. Consequently, GeoT can generate attention maps of interatomic relationships associated with the training objectives. In addition, GeoT achieves performance comparable to that of message-passing neural network (MPNN)-based models at reduced computational complexity. Our comprehensive experiments, including an empirical simulation, reveal that GeoT effectively learns chemical insights into molecular structures, bridging the gap between artificial intelligence and molecular sciences.