Antibodies are crucial proteins produced by the immune system in response to foreign substances, or antigens. The specificity of an antibody is determined by its complementarity-determining regions (CDRs), which are located in the variable domains of the antibody chains and form the antigen-binding site. Previous studies have used complex techniques to generate CDRs, but they suffer from inadequate geometric modeling. Moreover, the commonly used iterative refinement strategies make inference inefficient. In this paper, we propose a \textit{simple yet effective} model that co-designs the 1D sequences and 3D structures of CDRs in a one-shot manner. To achieve this, we decouple the antibody CDR design problem into two stages: (i) geometric modeling of protein complex structures and (ii) sequence-structure co-learning. We develop a novel macromolecular structure invariant embedding, designed specifically for protein complexes, that captures both intra- and inter-component interactions among the backbone atoms (C$\alpha$, N, C, and O) to achieve comprehensive geometric modeling. We then introduce a simple cross-gate MLP for sequence-structure co-learning, which allows the sequence and structure representations to implicitly refine each other. This enables our model to design the desired sequences and structures in a one-shot manner. Extensive experiments at both the sequence and structure levels demonstrate that our model outperforms state-of-the-art antibody CDR design methods.
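The two stages named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' architecture: the function names (`invariant_pair_features`, `init_cross_gate`, `cross_gate_layer`), the choice of pairwise backbone-atom distances as the invariant features, and the exact gating form are all assumptions made for illustration. The sketch only shows (i) features that do not change under rotation or translation of the complex and that mark intra- versus inter-component residue pairs, and (ii) a layer in which each of the sequence and structure representations is gated by the other.

```python
import numpy as np


def invariant_pair_features(coords, chain_ids):
    """Hypothetical invariant embedding (an assumption, not the paper's definition).

    coords:    (L, 4, 3) backbone atom positions per residue (N, CA, C, O).
    chain_ids: (L,) integer component label per residue.
    Returns:   (L, L, 17) array holding the 16 atom-to-atom distances for every
               residue pair plus one flag that is 1.0 for intra-component pairs.
    """
    n_res = coords.shape[0]
    # Broadcast to (L, L, 4, 4, 3): every atom of residue i against every atom of residue j.
    diff = coords[:, None, :, None, :] - coords[None, :, None, :, :]
    dists = np.linalg.norm(diff, axis=-1).reshape(n_res, n_res, 16)
    same_component = (chain_ids[:, None] == chain_ids[None, :]).astype(dists.dtype)
    return np.concatenate([dists, same_component[..., None]], axis=-1)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def init_cross_gate(rng, dim):
    """Random weights for the hypothetical cross-gate layer below."""
    def linear():
        return rng.normal(scale=dim ** -0.5, size=(dim, dim)), np.zeros(dim)
    return {name: linear() for name in
            ("seq_gate", "struct_gate", "seq_value", "struct_value")}


def cross_gate_layer(h_seq, h_struct, params):
    """One assumed form of cross-gating: each branch is updated by its own
    transformed features, scaled by a sigmoid gate computed from the other branch."""
    w, b = params["seq_gate"]
    gate_for_struct = sigmoid(h_seq @ w + b)       # sequence controls the structure update
    w, b = params["struct_gate"]
    gate_for_seq = sigmoid(h_struct @ w + b)       # structure controls the sequence update
    w, b = params["seq_value"]
    new_seq = h_seq + gate_for_seq * np.tanh(h_seq @ w + b)
    w, b = params["struct_value"]
    new_struct = h_struct + gate_for_struct * np.tanh(h_struct @ w + b)
    return new_seq, new_struct


# Usage on random data: 6 residues split across two components, hidden size 8.
rng = np.random.default_rng(0)
coords = rng.normal(size=(6, 4, 3))
chain_ids = np.array([0, 0, 0, 0, 1, 1])
pair_feats = invariant_pair_features(coords, chain_ids)
params = init_cross_gate(rng, dim=8)
h_seq, h_struct = cross_gate_layer(rng.normal(size=(6, 8)), rng.normal(size=(6, 8)), params)
```

Because both updates are computed in a single forward pass, a layer of this kind needs no iterative refinement loop, which is the property the abstract refers to as one-shot design.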