Antibodies are crucial proteins produced by the immune system in response to foreign substances (antigens). The specificity of an antibody is determined by its complementarity-determining regions (CDRs), which are located in the variable domains of the antibody chains and form the antigen-binding site. Previous studies have used complex techniques to generate CDRs, but they suffer from inadequate geometric modeling. Moreover, the commonly used iterative refinement strategies make inference inefficient. In this paper, we propose a \textit{simple yet effective} model that co-designs the 1D sequences and 3D structures of CDRs in a one-shot manner. To achieve this, we decouple the antibody CDR design problem into two stages: (i) geometric modeling of protein complex structures and (ii) sequence-structure co-learning. For the first stage, we develop a novel invariant embedding of macromolecular structures, designed specifically for protein complexes, that captures both intra- and inter-component interactions among the backbone atoms (C$\alpha$, N, C, and O) to achieve comprehensive geometric modeling. For the second stage, we introduce a simple cross-gate MLP for sequence-structure co-learning, which allows the sequence and structure representations to implicitly refine each other. This enables our model to design the desired sequences and structures in one shot. Extensive experiments at both the sequence and structure levels demonstrate that our model outperforms state-of-the-art antibody CDR design methods.
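The two stages named in the abstract can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the abstract does not give the exact form of the invariant embedding or of the cross-gate MLP, so everything below is an assumption made for illustration. Pairwise distances between backbone atoms stand in for the invariant features (distances do not change under rotation or translation), a component-id mask separates intra- from inter-component residue pairs, and the gate is assumed to be a sigmoid of a linear map of the other modality. The names `pairwise_backbone_distances`, `split_intra_inter`, and `cross_gate` are hypothetical.

```python
import numpy as np


def pairwise_backbone_distances(coords):
    """Distances between all backbone-atom pairs of every residue pair.

    coords: (L, 4, 3) array, four backbone atoms per residue
            (assumed order: N, C-alpha, C, O).
    Returns an (L, L, 16) array. Distances are invariant to global
    rotations and translations of the complex.
    """
    # Broadcast to (L, L, 4, 4, 3): atom a of residue i minus atom b of residue j.
    diff = coords[:, None, :, None, :] - coords[None, :, None, :, :]
    dist = np.linalg.norm(diff, axis=-1)  # (L, L, 4, 4)
    n_res = coords.shape[0]
    return dist.reshape(n_res, n_res, 16)


def split_intra_inter(component_ids):
    """Boolean (L, L) masks for residue pairs within one component
    (e.g. the same chain) and across components (e.g. antibody-antigen)."""
    same = component_ids[:, None] == component_ids[None, :]
    return same, ~same


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def cross_gate(h_seq, h_struct, w_seq_gate, w_struct_gate):
    """Assumed cross-gating step: each representation is scaled elementwise
    by a gate computed from the other one.

    h_seq, h_struct: (L, D) sequence and structure representations.
    w_seq_gate, w_struct_gate: (D, D) weights (hypothetical parameters).
    """
    gate_for_seq = sigmoid(h_struct @ w_seq_gate)     # structure gates sequence
    gate_for_struct = sigmoid(h_seq @ w_struct_gate)  # sequence gates structure
    return h_seq * gate_for_seq, h_struct * gate_for_struct


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    coords = rng.normal(size=(5, 4, 3))        # toy complex with 5 residues
    component_ids = np.array([0, 0, 0, 1, 1])  # 3 antibody + 2 antigen residues
    feats = pairwise_backbone_distances(coords)
    intra, inter = split_intra_inter(component_ids)
    print(feats[intra].shape, feats[inter].shape)
```

Because both gated outputs are computed in a single pass with no iterative loop, a layer of this kind is compatible with the one-shot design described above; how the actual model stacks such layers and predicts coordinates is not specified in the abstract.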