Motivation: The clinical efficacy of antibody therapeutics depends critically on high-affinity target engagement, yet laboratory affinity-maturation campaigns are slow and costly. On the computational side, most protein language models (PLMs) are not trained to favor high-affinity antibodies, and existing preference optimization approaches add substantial computational overhead without clear affinity gains. This work therefore proposes SimBinder-IF, which converts the inverse folding model ESM-IF into an antibody sequence generator by freezing its structure encoder and training only its decoder, through preference optimization, to prefer experimentally stronger binders.

Results: On the 11-assay AbBiBench benchmark, SimBinder-IF achieves a 55 percent relative improvement over vanilla ESM-IF in mean Spearman correlation between log-likelihood scores and experimentally measured binding affinity (from 0.264 to 0.410). In zero-shot generalization across four unseen antigen-antibody complexes, the correlation improves by 156 percent (from 0.115 to 0.294). SimBinder-IF also outperforms baselines in top-10 precision for affinity improvements of ten-fold or greater. A case study redesigning antibody F045-092 for A/California/04/2009 (pdmH1N1) shows that SimBinder-IF proposes variants with substantially lower predicted binding free energy changes than ESM-IF (mean ΔΔG of -75.16 vs. -46.57). Notably, SimBinder-IF trains only about 18 percent of the parameters of the full ESM-IF model, highlighting its parameter efficiency for high-affinity antibody generation.
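The headline numbers above are relative improvements in a Spearman correlation between model log-likelihoods and measured affinities. As a minimal sketch of that arithmetic, the Python snippet below recomputes the two quoted gains and shows how a Spearman correlation can be obtained as the Pearson correlation of ranks. The function names and the toy scores are hypothetical illustrations, not the authors' evaluation code or data.

```python
from statistics import mean


def relative_improvement(baseline, improved):
    """Relative change of `improved` over `baseline`, as a fraction."""
    return (improved - baseline) / baseline


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation, computed as Pearson correlation of the ranks."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Correlations quoted in the abstract (vanilla ESM-IF -> SimBinder-IF).
benchmark_gain = relative_improvement(0.264, 0.410)
zero_shot_gain = relative_improvement(0.115, 0.294)
print(f"{benchmark_gain:.1%}")  # 55.3%
print(f"{zero_shot_gain:.1%}")  # 155.7%

# Hypothetical toy data: per-variant log-likelihoods and measured affinities.
toy_log_likelihoods = [-3.2, -1.1, -2.5, -0.4]
toy_affinities = [0.1, 0.9, 0.4, 0.7]
toy_rho = spearman(toy_log_likelihoods, toy_affinities)
```

Because Spearman correlation depends only on rank order, it rewards a model for ordering variants correctly by affinity even when its log-likelihoods are not calibrated to the affinity scale.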


