We present ScoreFormer, a novel graph transformer model designed to accurately predict molecular docking scores and thereby optimize high-throughput virtual screening (HTVS) in drug discovery. The architecture integrates Principal Neighborhood Aggregation (PNA) and Learnable Random Walk Positional Encodings (LRWPE), which enhance the model's ability to capture complex molecular structures and their relationship to the corresponding docking scores. The approach significantly surpasses traditional HTVS methods and recent Graph Neural Network (GNN) models in both recovery and efficiency, owing to wider coverage of chemical space and enhanced performance. Our results demonstrate that ScoreFormer achieves competitive performance in docking score prediction and offers a substantial 1.65-fold reduction in inference time compared to existing models. We evaluated ScoreFormer on multiple datasets under various conditions, confirming its robustness and reliability in rapidly identifying potential drug candidates.
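To illustrate the random-walk positional encodings mentioned above, the following minimal sketch computes, for each node of a molecular graph, the probability that a random walk returns to that node after 1 to k steps. It assumes the common formulation in which these return probabilities initialize the positional features; the learnable update applied to them in LRWPE is not shown, and the function name `random_walk_pe` and the toy graph are hypothetical, not taken from the ScoreFormer implementation.

```python
def random_walk_pe(adjacency, num_steps):
    """Return per-node self-return probabilities for walks of length 1..num_steps.

    adjacency: square list-of-lists with 0/1 entries (undirected molecular graph).
    This is only the fixed initialization; any learnable part is out of scope.
    """
    n = len(adjacency)

    # Row-normalise the adjacency matrix to obtain the transition matrix P = D^-1 A.
    transition = []
    for row in adjacency:
        degree = sum(row)
        transition.append([a / degree if degree else 0.0 for a in row])

    def matmul(x, y):
        # Plain dense matrix product, sufficient for small illustrative graphs.
        return [
            [sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)
        ]

    encodings = [[] for _ in range(n)]
    power = transition  # holds P^step, starting at step 1
    for _ in range(num_steps):
        for i in range(n):
            # The diagonal entry (P^step)[i][i] is the return probability of node i.
            encodings[i].append(power[i][i])
        power = matmul(power, transition)
    return encodings


# Toy graph (an assumption for illustration): a three-atom chain 0-1-2.
chain = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
pe = random_walk_pe(chain, 3)
```

In this toy chain the central atom and the two terminal atoms receive different encodings, which shows how such features can distinguish nodes that share the same atom type but differ in structural position.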