Modeling protein-ligand interactions and accurately predicting their binding structures is a critical yet challenging task in drug discovery. Recent advances in deep learning have shown promise in addressing this challenge, with sampling-based and regression-based methods emerging as the two prominent approaches. Both have notable limitations. Sampling-based methods are often inefficient because they must generate multiple candidate structures and then select among them. Regression-based methods predict quickly but may lose accuracy. In addition, because protein sizes vary, external modules are often needed to select a suitable binding pocket, which further reduces efficiency. In this work, we propose $\mathbf{FABind}$, an end-to-end model that combines pocket prediction and docking to achieve accurate and fast protein-ligand binding structure prediction. $\mathbf{FABind}$ incorporates a unique ligand-informed pocket prediction module, which is also leveraged for docking pose estimation. The model further improves docking by incrementally integrating the predicted pocket when optimizing protein-ligand binding, which reduces the discrepancy between training and inference. In extensive experiments on benchmark datasets, $\mathbf{FABind}$ shows strong advantages in both effectiveness and efficiency over existing methods. Our code is available on $\href{https://github.com/QizhiPei/FABind}{GitHub}$.
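To illustrate how a predicted pocket might be integrated incrementally during training, the following is a minimal sketch and not the FABind implementation. It assumes a scheduled-sampling-style rule in which the docking stage is fed the ground-truth pocket early in training and the predicted pocket with increasing probability later, so that training conditions approach those at inference. The abstract does not specify this mechanism; the linear schedule and the names `predicted_pocket_probability` and `choose_pocket_center` are hypothetical.

```python
import random


def predicted_pocket_probability(epoch: int, total_epochs: int) -> float:
    """Hypothetical linear schedule: probability of using the predicted pocket.

    Ramps from 0.0 at the first epoch to 1.0 at the last epoch.
    """
    if total_epochs <= 1:
        return 1.0
    return min(1.0, max(0.0, epoch / (total_epochs - 1)))


def choose_pocket_center(true_center, predicted_center, epoch, total_epochs, rng):
    """Pick the pocket center passed to the docking stage for one training sample.

    Assumption: early epochs mostly use the ground-truth pocket; later epochs
    mostly use the model's own prediction, as would happen at inference time.
    """
    p = predicted_pocket_probability(epoch, total_epochs)
    return predicted_center if rng.random() < p else true_center


if __name__ == "__main__":
    rng = random.Random(0)
    true_center = (1.0, 2.0, 3.0)       # illustrative coordinates only
    predicted_center = (1.2, 1.9, 3.1)  # illustrative coordinates only
    for epoch in (0, 5, 9):
        p = predicted_pocket_probability(epoch, 10)
        center = choose_pocket_center(true_center, predicted_center, epoch, 10, rng)
        print(epoch, round(p, 2), center)
```

Under this assumed schedule, the first epoch always uses the ground-truth pocket and the final epoch always uses the predicted one. Any monotone schedule could be substituted.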