Discovering novel drug candidate molecules is one of the most fundamental and critical steps in drug development. Generative deep learning models, which generate synthetic data from a probability distribution, have been developed to draw completely new samples from a partially known space. Generative models offer high potential for de novo molecule design; however, to be useful in real-life drug development pipelines, they must be able to design target-specific molecules, which is the next step in this field. In this study, we propose DrugGEN, a system for the de novo design of drug candidate molecules that interact with selected target proteins. The proposed system represents compounds and protein structures as graphs and processes them with two serially connected generative adversarial networks built from graph transformers. DrugGEN is trained on a large dataset of compounds from ChEMBL and on target-specific bioactive molecules to design effective and specific inhibitory molecules against the AKT1 protein, which is critically important for developing treatments against various types of cancer. On fundamental benchmarks, DrugGEN models perform on par with or better than other methods. To assess target-specific generation performance, we conducted further in silico analysis with molecular docking and deep learning-based bioactivity prediction. The results indicate that the de novo molecules have high potential for interacting with the AKT1 protein structure at a level comparable to that of its native ligand. DrugGEN can be used to design completely novel and effective target-specific drug candidate molecules for any druggable protein, given target features and a dataset of experimental bioactivities. The code base, datasets, results, and trained models of DrugGEN are available at https://github.com/HUBioDataLab/DrugGEN
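To make the idea of representing a compound as a graph concrete, the following is a minimal sketch and is not taken from the DrugGEN code base. It assumes a tiny atom vocabulary (`ATOM_TYPES`), a tiny bond vocabulary (`BOND_TYPES`), and a hypothetical helper `molecule_to_graph`; the encoding actually used by DrugGEN may differ. The sketch builds a one-hot node feature matrix and an integer-coded adjacency matrix for the heavy atoms of ethanol.

```python
# Illustrative only: the vocabularies and the helper name are assumptions,
# not the encoding used by DrugGEN.
ATOM_TYPES = ["C", "N", "O"]                # assumed atom vocabulary
BOND_TYPES = ["none", "single", "double"]   # assumed bond vocabulary; index 0 means no bond


def molecule_to_graph(atoms, bonds):
    """Encode a molecule as (node feature matrix, adjacency matrix).

    atoms: list of element symbols, one per heavy atom.
    bonds: list of (i, j, bond_type) tuples with atom indices i and j.
    """
    n = len(atoms)
    # One-hot row per atom over the assumed atom vocabulary.
    node_features = [[1 if atom == t else 0 for t in ATOM_TYPES] for atom in atoms]
    # Symmetric matrix whose entries are bond-type indices.
    adjacency = [[0] * n for _ in range(n)]
    for i, j, kind in bonds:
        code = BOND_TYPES.index(kind)
        adjacency[i][j] = code
        adjacency[j][i] = code
    return node_features, adjacency


# Heavy-atom graph of ethanol: C-C-O with two single bonds.
nodes, adj = molecule_to_graph(
    ["C", "C", "O"],
    [(0, 1, "single"), (1, 2, "single")],
)
```

Matrices of this kind are a common input format for graph-based generative models; a protein structure could in principle be encoded the same way with a different node and edge vocabulary.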