Score-based generative modeling (SBGM) has achieved state-of-the-art performance in image generation, and the quality of the generated images depends strongly on the design of the forward (diffusion) process. Among such models, those based on stochastic differential equations (SDEs) have proven particularly effective. Traditional methods progressively destroy all image information so that images can be reconstructed from pure noise. In contrast, we propose a class of anisotropic stochastic partial differential equations (SPDEs) that preserve the geometric structure of the data over longer time scales during the forward transformation. These SPDEs consist of a drift term, which enforces deterministic destruction via structured smoothing, and a diffusion coefficient, which enables random destruction through noise injection. Both components are governed by anisotropy coefficients, enabling controlled, direction-dependent information degradation. This framework provides the theoretical foundation for a novel anisotropic score-based generative model. Because geometric structure is retained over longer time scales, the backward generative process can exploit residual geometric cues, leading to improved reconstruction fidelity. We empirically validate this improvement in a proof-of-concept implementation of unconditional image generation, showing that anisotropic diffusion can achieve better image quality metrics. We demonstrate consistent improvements over the SDE-driven baseline, as well as over the state-of-the-art Flow Matching approach, in both pixel-space and latent-space experiments. Finally, we demonstrate the effectiveness of the introduced anisotropy in a conditional stroke-to-image generation task.
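To make the split between drift and diffusion concrete, the following is a minimal illustrative sketch of a direction-dependent forward process, and not the SPDE proposed in this work. It assumes a toy equation dx = (a_h ∂²x/∂h² + a_v ∂²x/∂v²) dt + σ dW on a periodic pixel grid with unit spacing, discretized with an explicit Euler-Maruyama step. The function names and the constant axis-aligned coefficients `a_h`, `a_v`, and `sigma` are hypothetical, and the noise is kept isotropic for simplicity, whereas the abstract states that the anisotropy coefficients also govern the diffusion term.

```python
import numpy as np


def anisotropic_step(x, dt, a_h, a_v, sigma, rng):
    """One explicit Euler-Maruyama step of the toy equation above.

    a_h, a_v: assumed constant smoothing strengths along the horizontal
    (axis=1) and vertical (axis=0) directions. sigma: assumed noise scale.
    """
    # Second differences with periodic boundaries (np.roll wraps around).
    d2_h = np.roll(x, 1, axis=1) - 2.0 * x + np.roll(x, -1, axis=1)
    d2_v = np.roll(x, 1, axis=0) - 2.0 * x + np.roll(x, -1, axis=0)
    # Drift: deterministic, direction-dependent smoothing.
    drift = a_h * d2_h + a_v * d2_v
    # Diffusion: random destruction via injected Gaussian noise
    # (isotropic here, which is a simplification of this sketch).
    noise = rng.standard_normal(x.shape)
    return x + dt * drift + sigma * np.sqrt(dt) * noise


def forward_process(x0, n_steps, dt, a_h, a_v, sigma, seed=0):
    """Apply n_steps forward steps to a 2D image x0 and return the result."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=np.float64)
    for _ in range(n_steps):
        x = anisotropic_step(x, dt, a_h, a_v, sigma, rng)
    return x


if __name__ == "__main__":
    n = 8
    # Pattern that varies only along the horizontal axis, and its transpose.
    cols = np.tile((-1.0) ** np.arange(n), (n, 1))
    rows = cols.T
    # Strong horizontal smoothing, weak vertical smoothing, no noise.
    print(forward_process(cols, 10, 0.1, 1.0, 0.05, 0.0).std())
    print(forward_process(rows, 10, 0.1, 1.0, 0.05, 0.0).std())
```

With `a_h` much larger than `a_v`, detail that varies horizontally is erased quickly while detail that varies vertically survives far longer, which is a simple instance of direction-dependent information degradation. The explicit scheme is only stable for `dt * (a_h + a_v) <= 0.5` on a unit grid, so larger coefficients require a smaller step size.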