Diffusion-based generative models have transformed generative AI and enabled new capabilities in scientific domains, such as generating 3D structures of molecules. Many such tasks have an intrinsic symmetry: objects that can be converted into one another by a group action are regarded as equivalent, so the target distribution is essentially defined on the quotient space with respect to the group. In this work, we establish a formal framework for diffusion modeling on a general quotient space and apply it to molecular structure generation, which has special Euclidean group $\text{SE}(3)$ symmetry. The framework reduces the need to learn the component corresponding to the group action, which makes learning easier than in conventional group-equivariant diffusion models, and its sampler is guaranteed to recover the target distribution, whereas heuristic alignment strategies lack proper samplers. These claims are validated empirically on structure generation for small molecules and proteins, indicating that the principled quotient-space diffusion model outperforms previous treatments of symmetry.
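To make "the component corresponding to the group action" concrete, the following is a minimal NumPy sketch, not the method of this work. It assumes a structure is an `(N, 3)` coordinate array and that the group-action component of an update direction is its part along infinitesimal rigid translations and rotations, that is, the tangent space of the $\text{SE}(3)$ orbit. The function names `orbit_tangent_basis` and `project_out_group_component` are hypothetical and chosen only for illustration.

```python
import numpy as np

def orbit_tangent_basis(x):
    """Columns span the tangent space of the SE(3) orbit at x (shape (N, 3))."""
    n = x.shape[0]
    basis = []
    for k in range(3):
        e = np.zeros(3)
        e[k] = 1.0
        # Infinitesimal translation: every atom moves along e.
        basis.append(np.tile(e, (n, 1)).reshape(-1))
        # Infinitesimal rotation about axis e: atom i moves along e x x_i.
        basis.append(np.cross(e, x).reshape(-1))
    return np.stack(basis, axis=1)  # shape (3N, 6)

def project_out_group_component(x, v):
    """Remove from v (shape (N, 3)) its component along the SE(3) orbit at x."""
    b = orbit_tangent_basis(x)
    # Least-squares fit of v by rigid motions; the residual is orthogonal to them.
    coeffs, *_ = np.linalg.lstsq(b, v.reshape(-1), rcond=None)
    return v - (b @ coeffs).reshape(v.shape)

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 3))  # toy structure with 5 atoms (assumed data)
v = rng.normal(size=(5, 3))  # arbitrary update direction, e.g. noise or a score
v_h = project_out_group_component(x, v)

net_translation = v_h.sum(axis=0)            # close to zero: no rigid shift remains
net_rotation = np.cross(x, v_h).sum(axis=0)  # close to zero: no rigid rotation remains
```

After the projection, the remaining direction `v_h` changes only the shape of the structure, which is the part that matters on the quotient space. How the framework itself constructs and samples the diffusion process is not specified in this abstract, so the sketch should be read purely as an illustration of the geometry.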