We present VoxBind, a new score-based generative model for 3D molecules conditioned on protein structures. Our approach represents molecules as 3D atomic density grids and uses a 3D voxel-denoising network for learning and generation. We extend the neural empirical Bayes formalism (Saremi & Hyvärinen, 2019) to the conditional setting and generate structure-conditioned molecules with a two-step procedure: (i) sample noisy molecules from the Gaussian-smoothed conditional distribution with underdamped Langevin MCMC, using the learned score function, and (ii) estimate clean molecules from the noisy samples with single-step denoising. Compared to the current state of the art, our model is simpler to train, significantly faster to sample from, and achieves better results on extensive in silico benchmarks: the generated molecules are more diverse, exhibit fewer steric clashes, and bind with higher affinity to protein pockets. The code is available at https://github.com/genentech/voxbind/.
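The two-step procedure described above can be illustrated with a minimal toy sketch. It is not the VoxBind implementation: a scalar stands in for the voxel grid, a scalar `c` stands in for the protein pocket, and an analytic Gaussian score replaces the learned network. The names `score`, `walk`, `jump`, and `sample`, the constants `SIGMA` and `DATA_STD`, and the choice of a BAOAB-style integrator for the underdamped Langevin dynamics are all assumptions made for illustration. The denoising step uses the standard least-squares estimator of neural empirical Bayes, x̂ = y + σ² ∇ log p(y | c).

```python
import numpy as np

SIGMA = 0.9     # noise level of the Gaussian smoothing (assumed value)
DATA_STD = 0.5  # spread of the toy "clean" distribution (assumed value)


def score(y, c):
    """Score of the smoothed conditional density p(y | c) in the toy model.

    Assumption: x | c ~ N(c, DATA_STD^2) and y = x + N(0, SIGMA^2),
    so y | c ~ N(c, DATA_STD^2 + SIGMA^2). In practice this function
    would be supplied by a trained denoising network.
    """
    return -(y - c) / (DATA_STD**2 + SIGMA**2)


def walk(c, n_chains, n_steps, dt=0.2, gamma=1.0, seed=0):
    """Step (i): underdamped Langevin MCMC on the smoothed density.

    Uses a generic BAOAB splitting (half kick, half drift, friction
    plus noise, half drift, half kick) with unit mass.
    """
    rng = np.random.default_rng(seed)
    y = rng.normal(0.0, SIGMA, size=n_chains)  # noisy initial positions
    v = np.zeros(n_chains)                     # initial velocities
    decay = np.exp(-gamma * dt)
    noise_scale = np.sqrt(1.0 - decay**2)
    for _ in range(n_steps):
        v = v + 0.5 * dt * score(y, c)
        y = y + 0.5 * dt * v
        v = decay * v + noise_scale * rng.normal(size=n_chains)
        y = y + 0.5 * dt * v
        v = v + 0.5 * dt * score(y, c)
    return y


def jump(y, c):
    """Step (ii): single-step denoising, x_hat = y + SIGMA^2 * score(y, c)."""
    return y + SIGMA**2 * score(y, c)


def sample(c, n_chains=2000, n_steps=500, seed=0):
    """Run the walk, then denoise every chain once."""
    return jump(walk(c, n_chains, n_steps, seed=seed), c)


if __name__ == "__main__":
    x_hat = sample(c=2.0)
    print("mean of denoised samples:", x_hat.mean())
```

In this toy setting the denoised samples concentrate around the conditioning value `c`. The sampler only ever operates on the smoothed variable `y`, and the clean estimate is obtained with a single evaluation of the score at the end, which is the structure the abstract describes.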