Target Specific De Novo Design of Drug Candidate Molecules with Graph Transformer-based Generative Adversarial Networks

Discovering novel drug candidate molecules is one of the most fundamental and critical steps in drug development. Generative deep learning models, which create synthetic data given a probability distribution, have been developed to sample completely new data points from a partially known space. Generative models offer high potential for designing de novo molecules; however, to be useful in real-life drug development pipelines, they should be able to design target-specific molecules, which is the next step in this field. In this study, we propose DrugGEN for the de novo design of drug candidate molecules that interact with selected target proteins. The proposed system represents compounds and protein structures as graphs and processes them via two serially connected generative adversarial networks built from graph transformers. DrugGEN is trained on a large dataset of compounds from ChEMBL and on target-specific bioactive molecules, to design effective and specific inhibitory molecules against the AKT1 protein, which is critically important for developing treatments against various types of cancer. On fundamental benchmarks, DrugGEN models perform competitively with or better than other methods. To assess target-specific generation performance, we conducted further in silico analysis with molecular docking and deep learning-based bioactivity prediction. The results indicate that the de novo molecules have high potential for interacting with the AKT1 protein structure at a level comparable to its native ligand. DrugGEN can be used to design completely novel and effective target-specific drug candidate molecules for any druggable protein, given target features and a dataset of experimental bioactivities. The code base, datasets, results and trained models of DrugGEN are available at https://github.com/HUBioDataLab/DrugGEN
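
To make the graph representation mentioned in the abstract concrete, the following minimal sketch encodes a small molecule as an annotation matrix (one-hot atom types per node) and an adjacency matrix (bond type per atom pair). It is only an illustration under stated assumptions, not DrugGEN's actual preprocessing: the atom vocabulary `ATOM_TYPES`, the bond codes in `BOND_TYPES`, and the helper `molecule_to_graph` are all hypothetical names introduced here.

```python
# Illustrative sketch only: not taken from the DrugGEN code base.
# Assumed (hypothetical) vocabularies for atoms and bonds.
ATOM_TYPES = ["C", "N", "O"]
BOND_TYPES = {"single": 1, "double": 2}  # 0 is reserved for "no bond"


def molecule_to_graph(atoms, bonds):
    """Encode a molecule as (annotation, adjacency) nested lists.

    atoms: list of element symbols, one per heavy atom
    bonds: list of (i, j, bond_name) tuples using atom indices
    """
    n = len(atoms)
    # One-hot row per atom over the assumed atom vocabulary.
    annotation = [[1 if a == t else 0 for t in ATOM_TYPES] for a in atoms]
    # Symmetric matrix holding the bond code between each atom pair.
    adjacency = [[0] * n for _ in range(n)]
    for i, j, name in bonds:
        code = BOND_TYPES[name]
        adjacency[i][j] = code
        adjacency[j][i] = code
    return annotation, adjacency


# Heavy-atom skeleton of ethanol: C-C-O with two single bonds.
annotation, adjacency = molecule_to_graph(
    ["C", "C", "O"],
    [(0, 1, "single"), (1, 2, "single")],
)
```

In a graph-based generative model, matrices of this kind would typically be what the generator emits and the discriminator scores; how DrugGEN itself lays them out is documented in its repository.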

