Modeling the interaction between proteins and ligands and accurately predicting their binding structures is a critical yet challenging task in drug discovery. Recent advances in deep learning have shown promise in addressing this challenge, with sampling-based and regression-based methods emerging as the two prominent approaches. However, both have notable limitations. Sampling-based methods are often inefficient because they must generate multiple candidate structures and then select among them. Regression-based methods offer fast predictions but may be less accurate. In addition, because protein sizes vary, these methods often require an external module to select a suitable binding pocket, which further reduces efficiency. In this work, we propose $\mathbf{FABind}$, an end-to-end model that combines pocket prediction and docking to achieve accurate and fast protein-ligand binding. $\mathbf{FABind}$ incorporates a unique ligand-informed pocket prediction module, which is also leveraged for docking pose estimation. The model further enhances the docking process by incrementally integrating the predicted pocket to optimize protein-ligand binding, reducing discrepancies between training and inference. Through extensive experiments on benchmark datasets, $\mathbf{FABind}$ demonstrates strong advantages in effectiveness and efficiency compared to existing methods. Our code is available at https://github.com/QizhiPei/FABind.