Denoising diffusion probabilistic models (DDPMs) have recently taken the field of generative modeling by storm, setting new state-of-the-art results in disciplines such as computer vision and computational biology on tasks ranging from text-guided image generation to structure-guided protein design. Along this latter line of research, methods such as those of Hoogeboom et al. (2022) have been proposed for generating 3D molecules using equivariant graph neural networks (GNNs) within a DDPM framework. Building on this work, we propose GCDM, a geometry-complete diffusion model that achieves new state-of-the-art results for diffusion-based 3D molecule generation and optimization by leveraging the representation learning strengths of GNNs that perform geometry-complete message passing. Our results with GCDM also offer preliminary insights into how physical inductive biases impact the generative dynamics of molecular DDPMs. The source code, data, and instructions to train new models or reproduce our results are freely available at https://github.com/BioinfoMachineLearning/Bio-Diffusion.