Denoising diffusion probabilistic models (DDPMs) have recently taken the field of generative modeling by storm, setting new state-of-the-art results in disciplines such as computer vision and computational biology on tasks ranging from text-guided image generation to structure-guided protein design. Along the latter line of research, methods such as that of Hoogeboom et al. (2022) have been proposed for unconditionally generating 3D molecules using equivariant graph neural networks (GNNs) within a DDPM framework. Building on this work, we propose GCDM, a geometry-complete diffusion model that achieves new state-of-the-art results for diffusion-based 3D molecule generation by leveraging the representation learning strengths of GNNs that perform geometry-complete message passing. Our results with GCDM also offer preliminary insights into how physical inductive biases affect the generative dynamics of molecular DDPMs. The source code, data, and instructions for training new models or reproducing our results are freely available at https://github.com/BioinfoMachineLearning/bio-diffusion.