Discovering novel drug candidate molecules is one of the most fundamental and critical steps in drug development. Generative deep learning models, which create synthetic data given a probability distribution, have been developed to draw completely new samples from a partially known space. Generative models offer high potential for de novo molecule design; however, to be useful in real-life drug development pipelines, they should be able to design target-specific molecules, which is the next step in this field. In this study, we propose DrugGEN for the de novo design of drug candidate molecules that interact with selected target proteins. The proposed system represents compounds and protein structures as graphs and processes them with two serially connected generative adversarial networks comprising graph transformers. DrugGEN is trained on a large dataset of compounds from ChEMBL and on target-specific bioactive molecules to design effective and specific inhibitory molecules against the AKT1 protein, which is critically important for developing treatments against various types of cancer. On fundamental benchmarks, DrugGEN models perform competitively with or better than other methods. To assess target-specific generation performance, we conducted further in silico analysis with molecular docking and deep learning-based bioactivity prediction. The results indicate that the de novo molecules have high potential for interacting with the AKT1 protein structure at the level of its native ligand. DrugGEN can be used to design completely novel and effective target-specific drug candidate molecules for any druggable protein, given target features and a dataset of experimental bioactivities. The code base, datasets, results and trained models of DrugGEN are available at https://github.com/HUBioDataLab/DrugGEN