Accurately predicting changes in protein melting temperature (ΔTm) is fundamental to assessing protein stability and guiding protein engineering. Multi-modal protein representations have shown great promise in capturing the complex relationships among protein sequence, structure, and function. In this study, we develop models based on powerful pretrained protein models, including ESM-2, ESM-3, and AlphaFold, and use various feature extraction methods to enhance prediction accuracy. Using the ESM-3 model, we achieve new state-of-the-art performance on the s571 test dataset, with a Pearson correlation coefficient (PCC) of 0.50. Furthermore, we conduct a fair evaluation comparing the performance of different protein language models on the ΔTm prediction task. Our results demonstrate that integrating multi-modal protein representations could advance the prediction of protein melting temperatures.
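For readers unfamiliar with the reported metric, the Pearson correlation coefficient between predicted and measured ΔTm values can be computed as in the minimal sketch below. The function name `pearson_corr` and the sample values are hypothetical and chosen only for illustration; they are not taken from the study or from the s571 dataset.

```python
import math


def pearson_corr(predicted, measured):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(predicted) != len(measured):
        raise ValueError("inputs must have the same length")
    n = len(predicted)
    if n < 2:
        raise ValueError("at least two paired values are required")

    mean_p = sum(predicted) / n
    mean_m = sum(measured) / n

    # Sum of products of deviations (unnormalized covariance).
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    # Sums of squared deviations (unnormalized variances).
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_m = sum((m - mean_m) ** 2 for m in measured)

    if var_p == 0 or var_m == 0:
        raise ValueError("PCC is undefined when one input is constant")
    return cov / math.sqrt(var_p * var_m)


# Hypothetical ΔTm values in degrees Celsius, for illustration only.
predicted_dtm = [-1.2, 0.5, 2.0, -3.1, 0.8]
measured_dtm = [-0.9, 1.1, 1.4, -2.5, -0.2]
pcc = pearson_corr(predicted_dtm, measured_dtm)
```

A PCC of 1 indicates a perfect positive linear relationship between predictions and measurements, 0 indicates no linear relationship, and -1 a perfect negative one.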