Modeling protein-ligand interactions and accurately predicting their binding structures is a critical yet challenging task in drug discovery. Recent advances in deep learning have shown promise in addressing this challenge, with sampling-based and regression-based methods emerging as the two prominent approaches. Both have notable limitations. Sampling-based methods are often inefficient because they must generate multiple candidate structures and then select among them. Regression-based methods predict quickly but can be less accurate. In addition, because protein sizes vary, an external module is often needed to select suitable binding pockets, which further reduces efficiency. In this work, we propose $\mathbf{FABind}$, an end-to-end model that combines pocket prediction and docking to predict protein-ligand binding structures accurately and quickly. $\mathbf{FABind}$ incorporates a ligand-informed pocket prediction module, which is also leveraged for docking pose estimation. The model further enhances docking by incrementally integrating the predicted pocket to optimize protein-ligand binding, which reduces the discrepancy between training and inference. In extensive experiments on benchmark datasets, $\mathbf{FABind}$ shows strong advantages over existing methods in both effectiveness and efficiency. Our code is available at $\href{https://github.com/QizhiPei/FABind}{GitHub}$.