High-dimensional generative modeling is fundamentally a manifold-learning problem: real data concentrate near a low-dimensional structure embedded in the ambient space. Effective generators must therefore balance support fidelity -- placing probability mass near the data manifold -- with sampling efficiency. Diffusion models often capture near-manifold structure but require many iterative denoising steps and can leak probability mass off the support; normalizing flows sample in one pass but are limited by invertibility and dimension preservation. We propose MAGT (Manifold-Aligned Generative Transport), a flow-like generator that learns a one-shot, manifold-aligned transport from a low-dimensional base distribution to the data space. Training is performed at a fixed Gaussian smoothing level, where the score is well-defined and numerically stable. We approximate this fixed-level score using a finite set of latent anchor points with self-normalized importance sampling, yielding a tractable training objective. MAGT samples in a single forward pass, concentrates probability mass near the learned support, and induces an intrinsic density with respect to the manifold volume measure, enabling principled likelihood evaluation for generated samples. We establish finite-sample Wasserstein bounds that link the smoothing level and score-approximation accuracy to generative fidelity, and we show empirically that MAGT improves fidelity and manifold concentration across synthetic and benchmark datasets while sampling substantially faster than diffusion models.
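To make the fixed-level score approximation concrete, the following is a minimal sketch, our illustration rather than the authors' released code. With K latent anchors z_k and a generator G, the smoothed density p_sigma(x) is approximated by the mixture (1/K) sum_k N(x; G(z_k), sigma^2 I), whose score is a self-normalized-importance-weighted average of per-anchor Gaussian scores. The names `smoothed_score`, `anchor_outputs`, and `G` are hypothetical.

```python
# Sketch (assumed, not the paper's implementation) of the fixed-level
# score estimator described in the abstract. Approximating
#   p_sigma(x) ~= (1/K) sum_k N(x; G(z_k), sigma^2 I)
# gives the exact mixture score
#   grad_x log p_sigma(x) = sum_k w_k (G(z_k) - x) / sigma^2,
# where w_k are self-normalized importance weights.
import numpy as np

def smoothed_score(x, anchor_outputs, sigma):
    """Estimate grad_x log p_sigma(x) at a fixed smoothing level sigma.

    x              : (d,)   query point in data space
    anchor_outputs : (K, d) generator outputs G(z_k) at K latent anchors
    sigma          : float  fixed Gaussian smoothing level
    """
    diffs = anchor_outputs - x                            # (K, d): G(z_k) - x
    log_w = -0.5 * np.sum(diffs**2, axis=1) / sigma**2    # log N(x; G(z_k), sigma^2 I), up to a constant
    log_w -= log_w.max()                                  # log-sum-exp shift for numerical stability
    w = np.exp(log_w)
    w /= w.sum()                                          # self-normalized importance weights
    return (w[:, None] * diffs).sum(axis=0) / sigma**2    # weighted average of per-anchor scores

# Toy usage with a stand-in generator (hypothetical; any map from latents to data space works).
rng = np.random.default_rng(0)
W = rng.normal(size=(2, 8))
G = lambda z: np.tanh(z @ W)           # stand-in for a trained transport map
Z = rng.normal(size=(256, 2))          # latent anchors z_k drawn from the base distribution
x = G(Z)[0] + 0.1 * rng.normal(size=8) # a query point near the learned support
s = smoothed_score(x, G(Z), sigma=0.5)
```

Working at a single fixed sigma is what keeps this estimator stable: the weights never involve ratios of densities at vanishing noise levels, and the same anchors can be reused across queries.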