Peptides offer great biomedical potential and are promising drug candidates. Currently, the majority of approved peptide drugs are derived directly from well-explored natural human peptides. Advanced deep learning techniques are therefore needed to identify novel peptide drugs in the vast, unexplored biochemical space. Although various in silico methods have been developed to accelerate early-stage peptide drug discovery, existing models suffer from overfitting and poor generalizability because the available experimental data are limited in size, imbalanced in distribution and inconsistent in quality. In this study, we propose PepGB, a deep learning framework that facilitates early-stage peptide drug discovery by predicting peptide-protein interactions (PepPIs). Built on graph neural networks, PepGB incorporates a fine-grained perturbation module and a dual-view objective, together with contrastive learning-based pre-trained peptide representations, to predict PepPIs. Through rigorous evaluations, we demonstrate that PepGB substantially outperforms baseline methods and can accurately identify PepPIs for novel targets and novel peptide hits, thereby contributing to the target identification and hit discovery processes. We further derive an extended version, diPepGB, to tackle the bottleneck of modeling the highly imbalanced data that are prevalent in the lead generation and optimization processes. By using directed edges to represent the relative binding strength between two peptide nodes, diPepGB achieves superior performance in real-world assays. In summary, our proposed frameworks can serve as powerful tools to facilitate early-stage peptide drug discovery.
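To make the directed-edge idea concrete, the following is a minimal sketch of how per-peptide binding measurements against one target could be turned into directed edges between peptide nodes. It is an illustration only and not the diPepGB implementation: the function name `build_directed_edges`, the `margin` parameter, the example scores, and the conventions that a higher score means stronger binding and that an edge points from the stronger to the weaker binder are all assumptions made here.

```python
from itertools import combinations


def build_directed_edges(binding_scores, margin=0.0):
    """Build (stronger, weaker) edges between peptide nodes.

    binding_scores: dict mapping a peptide identifier to a score for one
    target. Assumption of this sketch: a higher score means stronger binding.
    margin: minimum score difference needed to call one peptide stronger.
    """
    edges = []
    # Visit every unordered peptide pair once, in a deterministic order.
    for a, b in combinations(sorted(binding_scores), 2):
        diff = binding_scores[a] - binding_scores[b]
        if diff > margin:
            edges.append((a, b))  # a binds more strongly than b
        elif diff < -margin:
            edges.append((b, a))  # b binds more strongly than a
        # Pairs within the margin are treated as ties and get no edge.
    return edges


# Hypothetical scores for three peptides tested against the same target.
scores = {"pep_A": 7.2, "pep_B": 5.1, "pep_C": 7.1}
edges = build_directed_edges(scores, margin=0.5)
print(edges)
```

In this sketch each edge encodes only which peptide of a pair binds more strongly, so the model input is a set of relative comparisons rather than absolute labels; how diPepGB actually constructs and uses its edges is not specified in the abstract.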