Nature creates diverse proteins through a `divide-and-assembly' strategy. Inspired by this idea, we introduce ProteinWeaver, a two-stage framework for protein backbone design. Our method first generates individual protein domains and then employs an SE(3) diffusion model to flexibly assemble them. The assembly step is the key challenge, because the inter-domain interaction landscape is complex and rugged. To address it, we employ preference alignment, which learns the relationship between structure and the interaction landscape by comparing generated samples. Comprehensive experiments demonstrate that ProteinWeaver: (1) generates high-quality, novel protein backbones through versatile domain assembly; (2) outperforms RFdiffusion, the current state-of-the-art in backbone design, by 13\% and 39\% for long-chain proteins; (3) shows potential for cooperative function design through illustrative case studies. In summary, by introducing a `divide-and-assembly' paradigm, ProteinWeaver advances protein engineering and opens new avenues for functional protein design.
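To make the assembly step concrete, the following is a minimal sketch of what placing one domain relative to another with a single SE(3) element (a rotation plus a translation) looks like. It is not the authors' implementation: the coordinates, the rotation angle, the translation, and the helper names `rotation_z` and `apply_se3` are all hypothetical, and in ProteinWeaver the placement is produced by a diffusion model rather than chosen by hand. The flexible assembly described in the abstract may also go beyond a single rigid-body transform.

```python
import math

def rotation_z(theta):
    """Return the 3x3 matrix for a rotation of theta radians about the z-axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def apply_se3(rotation, translation, coords):
    """Apply the rigid-body map x -> R x + t to every 3D point in coords."""
    moved = []
    for x in coords:
        rx = [sum(rotation[i][j] * x[j] for j in range(3)) for i in range(3)]
        moved.append([rx[i] + translation[i] for i in range(3)])
    return moved

# Hypothetical C-alpha coordinates (in angstroms) for two toy "domains".
domain_a = [[0.0, 0.0, 0.0], [3.8, 0.0, 0.0], [7.6, 0.0, 0.0]]
domain_b = [[0.0, 0.0, 0.0], [0.0, 3.8, 0.0], [0.0, 7.6, 0.0]]

# Place domain B relative to domain A with one hand-picked rigid-body transform.
placed_b = apply_se3(rotation_z(math.pi / 2), [0.0, 10.0, 0.0], domain_b)

# The assembled backbone is the fixed domain plus the placed domain.
assembly = domain_a + placed_b

# A rigid-body transform preserves distances inside the moved domain.
before = math.dist(domain_b[0], domain_b[2])
after = math.dist(placed_b[0], placed_b[2])
print(f"intra-domain distance before: {before:.3f}, after: {after:.3f}")
```

Because the transform is rigid, each domain's internal geometry is unchanged and only the relative placement of the domains varies. That relative placement is the quantity the assembly stage has to get right.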