Antibodies are widely used as therapeutics, but their development requires costly affinity maturation, in which iterative mutations are introduced to enhance binding affinity. This paper explores a sequence-only setting for affinity maturation, in which only the antibody and antigen sequences are available. AlphaFlow recently wrapped AlphaFold within a flow matching framework to generate diverse protein structures, yielding a sequence-conditioned generative model of structure. Building on this, we propose an alternating optimization framework that (1) fixes the sequence and guides structure generation toward high binding affinity using a structure-based affinity predictor, then (2) applies inverse folding to propose sequence mutations, which are filtered by a sequence-based affinity predictor in a post-selection step. A key challenge is the lack of labeled data for training both predictors. To address this, we develop a co-teaching module that incorporates valuable information from noisy biophysical energies into predictor refinement: the sequence-based predictor selects consensus samples to teach the structure-based predictor, and vice versa. Our method, AffinityFlow, achieves state-of-the-art performance in affinity maturation experiments. We plan to open-source our code after acceptance.
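The alternating loop and the co-teaching selection described in the abstract can be illustrated with a minimal sketch. This is not the AffinityFlow implementation, which has not been released. Every name here (`alternate`, `select_consensus`, `co_teaching_step`, `keep_ratio`, and the callables passed in) is hypothetical. The sketch also makes two assumptions: a "consensus sample" is one whose prediction lies close to the noisy biophysical energy, and the current sequence is kept as a candidate during post-selection.

```python
import numpy as np


def alternate(seq, generate_structure, inverse_fold, seq_score, rounds=3, n_candidates=8):
    """Alternate between structure generation and sequence mutation.

    All callables are placeholders supplied by the caller (assumptions):
      generate_structure(seq) -> structure, guided toward high affinity
      inverse_fold(structure, n) -> list of n candidate sequences
      seq_score(seq) -> predicted affinity from a sequence-based predictor
    """
    for _ in range(rounds):
        # Step 1: fix the sequence and generate a structure.
        structure = generate_structure(seq)
        # Step 2: propose mutated sequences by inverse folding.
        candidates = list(inverse_fold(structure, n_candidates))
        # Post-selection: keep the best-scoring sequence; the current
        # sequence stays in the pool (an assumption of this sketch).
        seq = max(candidates + [seq], key=seq_score)
    return seq


def select_consensus(pred, noisy_energy, keep_ratio=0.5):
    """Indices of the samples whose prediction is closest to the noisy energy."""
    err = np.abs(np.asarray(pred, dtype=float) - np.asarray(noisy_energy, dtype=float))
    k = max(1, int(round(keep_ratio * len(err))))
    return np.argsort(err, kind="stable")[:k]


def co_teaching_step(seq_pred, struct_pred, noisy_energy, keep_ratio=0.5):
    """Each predictor picks the samples used to refine the other one."""
    teach_struct = select_consensus(seq_pred, noisy_energy, keep_ratio)
    teach_seq = select_consensus(struct_pred, noisy_energy, keep_ratio)
    return teach_struct, teach_seq
```

In an actual pipeline the returned index sets would select the training batch for the opposite predictor in each refinement round. How the two predictors are parameterized and how the energies are computed are left open here.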