Accurately predicting changes in protein melting temperature (ΔTm) is fundamental to assessing protein stability and guiding protein engineering. Multi-modal protein representations have shown great promise in capturing the complex relationships among protein sequence, structure, and function. In this study, we develop ΔTm predictors built on powerful pretrained protein models, including the protein language models ESM-2 and ESM-3 and the structure predictor AlphaFold, and apply several feature extraction methods to improve prediction accuracy. Using ESM-3, we achieve new state-of-the-art performance on the s571 test dataset, with a Pearson correlation coefficient (PCC) of 0.50. Furthermore, we conduct a fair evaluation comparing the performance of different protein language models on the ΔTm prediction task. Our results demonstrate that integrating multi-modal protein representations could advance the prediction of protein melting temperatures.
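For readers unfamiliar with the metric, the PCC reported above measures the linear agreement between predicted and experimentally measured ΔTm values. The following is a minimal sketch of that computation in plain Python, not the evaluation code used in this study; the function name `pearson_correlation` and the sample values are hypothetical and chosen only for illustration.

```python
import math


def pearson_correlation(predicted, measured):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(predicted) != len(measured) or len(predicted) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_m = sum(measured) / n
    # Unnormalised covariance and variances; the 1/n factors cancel in the ratio.
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_m = sum((m - mean_m) ** 2 for m in measured)
    if var_p == 0 or var_m == 0:
        raise ValueError("PCC is undefined when either sequence is constant")
    return cov / math.sqrt(var_p * var_m)


# Hypothetical ΔTm values in °C, invented purely for illustration
# (these are NOT taken from the s571 dataset).
predicted_dtm = [-2.1, 0.5, 1.8, -4.0, 3.2]
measured_dtm = [-1.5, 1.0, 0.9, -5.2, 2.4]
print(f"PCC = {pearson_correlation(predicted_dtm, measured_dtm):.2f}")
```

A PCC of 1 indicates perfect positive linear agreement and 0 indicates no linear relationship, so the reported value of 0.50 corresponds to a moderate positive correlation between predictions and measurements.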