Retrosynthesis plays a crucial role in organic synthesis and drug development, where the goal is to identify suitable reactants that can yield a target product molecule. Although existing methods have achieved notable success, they typically overlook the 3D conformational details and internal spatial organization of molecules. This oversight makes it difficult to predict reactants that conform to genuine chemical principles, particularly for complex molecular structures such as polycyclic and heteroaromatic compounds. To address this challenge, we introduce a novel transformer-based, template-free approach that incorporates 3D conformer data and spatial information. Our approach includes an Atom-align Fusion module that integrates 3D positional data at the input stage, ensuring correct alignment between atom tokens and their respective 3D coordinates. Additionally, we propose a Distance-weighted Attention mechanism that refines the self-attention process, restricting the model's focus to relevant atom pairs in 3D space. Extensive experiments on the USPTO-50K dataset demonstrate that our model outperforms previous template-free methods, setting a new benchmark for the field. A case study further highlights our method's ability to predict reasonable and accurate reactants.
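To make the idea of distance-weighted attention concrete, the following is a minimal sketch and not the formulation used in this work, which the abstract does not specify. It assumes a hypothetical Gaussian-style penalty, controlled by an illustrative parameter `sigma`, that is subtracted from scaled dot-product scores so that atom pairs far apart in 3D space receive less attention. For brevity, the raw atom features stand in for learned query, key, and value projections.

```python
import math


def pairwise_distances(coords):
    """Euclidean distance between every pair of 3D atom coordinates."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def distance_weighted_attention(features, coords, sigma=2.0):
    """Self-attention over atoms with a distance-based score penalty.

    features: list of per-atom feature vectors (all of the same length d)
    coords:   list of (x, y, z) positions, one per atom
    sigma:    hypothetical length scale of the penalty (an assumption)
    """
    d = len(features[0])
    dist = pairwise_distances(coords)
    weights = []
    for i, q in enumerate(features):
        row = []
        for j, k in enumerate(features):
            # Standard scaled dot-product score between atoms i and j.
            score = sum(a * b for a, b in zip(q, k)) / math.sqrt(d)
            # Assumed penalty: distant pairs are down-weighted.
            row.append(score - dist[i][j] ** 2 / (2 * sigma ** 2))
        weights.append(softmax(row))
    # Each output is the attention-weighted average of the feature vectors.
    outputs = [
        [sum(w * v[c] for w, v in zip(row, features)) for c in range(d)]
        for row in weights
    ]
    return outputs, weights


# Three atoms with identical features; the third is far from the other two.
features = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
coords = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (10.0, 0.0, 0.0)]
outputs, weights = distance_weighted_attention(features, coords)
```

Because the three feature vectors are identical, any difference between the attention weights in this example comes from the distance term alone: the first atom attends more strongly to its near neighbor than to the distant atom.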