Protein inverse folding, the task of predicting amino acid sequences that fold into desired structures, is pivotal for de novo protein design. However, existing GNN-based methods typically suffer from restricted receptive fields that miss long-range dependencies and from a "single-pass" inference paradigm that accumulates errors. To address these bottlenecks, we propose RIGA-Fold, a framework that synergizes Recurrent Interaction with Geometric Awareness. At the micro level, we introduce a Geometric Attention Update (GAU) module in which edge features explicitly serve as attention keys, ensuring strictly SE(3)-invariant local encoding. At the macro level, we design an attention-based Global Context Bridge that acts as a soft gating mechanism to dynamically inject global topological information. Furthermore, to bridge the gap between the structural and sequence modalities, we introduce an enhanced variant, RIGA-Fold*, which integrates trainable geometric features with frozen evolutionary priors from ESM-2 and ESM-IF via a dual-stream architecture. Finally, a biologically inspired "predict-recycle-refine" strategy iteratively denoises the predicted sequence distribution. Extensive experiments on the CATH 4.2, TS50, and TS500 benchmarks demonstrate that our purely geometric framework is highly competitive, while RIGA-Fold* significantly outperforms state-of-the-art baselines in both sequence recovery and structural consistency.
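To make the micro-level idea concrete, here is a minimal NumPy sketch of attention where SE(3)-invariant edge features (e.g. pairwise distances and angles) serve as the keys, so the update is invariant whenever its inputs are. All names (`geometric_attention_update`, the weight matrices, the value construction) are illustrative assumptions, not the paper's actual GAU implementation.

```python
import numpy as np

def geometric_attention_update(h, e, neighbors, Wq, Wk, Wv):
    """Hypothetical edge-as-key attention update.

    h: (N, d) node features; e: (N, K, d) SE(3)-invariant edge features;
    neighbors: (N, K) integer indices of each node's K neighbors.
    Queries attend over invariant edge features rather than raw
    coordinates, so the output inherits SE(3) invariance.
    """
    q = h @ Wq                             # (N, d) queries from node states
    k = e @ Wk                             # (N, K, d) keys from edge features
    v = np.concatenate([e, h[neighbors]], axis=-1) @ Wv  # (N, K, d) values
    logits = np.einsum('nd,nkd->nk', q, k) / np.sqrt(q.shape[-1])
    a = np.exp(logits - logits.max(-1, keepdims=True))
    a = a / a.sum(-1, keepdims=True)       # softmax over the K neighbors
    return np.einsum('nk,nkd->nd', a, v)   # (N, d) updated node features
```

In this reading, the edge features carry the local geometry, while the node states supply the queries; mixing neighbor states into the values is one plausible way to propagate information along edges.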
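The "predict-recycle-refine" strategy can be sketched as a loop that feeds the previous round's sequence distribution back into the model as conditioning. This is a schematic under assumed interfaces (the `model(structure, probs)` signature, uniform initialization, and the cycle count are all illustrative), not the framework's actual recycling procedure.

```python
import numpy as np

def recycle_refine(model, structure, num_cycles=3, num_classes=20):
    """Hypothetical predict-recycle-refine loop.

    structure: (N, f) per-residue structural features.
    model(structure, probs) -> (N, num_classes) logits, conditioned on
    the previous cycle's per-residue amino-acid distribution.
    """
    n = structure.shape[0]
    probs = np.full((n, num_classes), 1.0 / num_classes)  # uniform prior
    for _ in range(num_cycles):
        logits = model(structure, probs)        # predict given last estimate
        logits = logits - logits.max(-1, keepdims=True)
        probs = np.exp(logits)
        probs = probs / probs.sum(-1, keepdims=True)  # refined distribution
    return probs.argmax(-1)                     # (N,) final sequence indices
```

Each cycle re-predicts the full distribution rather than committing to discrete residues, which is what lets later cycles correct errors that a single-pass decoder would lock in.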