We introduce Latent Space Distribution Matching (LSDM), a novel framework for semi-supervised generative modeling of conditional distributions. LSDM operates in two stages: (i) learning a low-dimensional latent space from both paired and unpaired data, and (ii) performing joint distribution matching in this space via the 1-Wasserstein distance, using only paired data. This two-step approach minimizes an upper bound on the 1-Wasserstein distance between joint distributions, reducing reliance on scarce paired samples while enabling fast one-step generation. Theoretically, we establish non-asymptotic error bounds and demonstrate a key benefit of unpaired data: enhanced geometric fidelity in generated outputs. Furthermore, by extending the scope of its two core steps, LSDM provides a coherent statistical perspective that connects to a broad class of latent-space approaches. Notably, Latent Diffusion Models (LDMs) can be viewed as a variant of LSDM, in which joint distribution matching is achieved indirectly via score matching. Consequently, our results also provide theoretical insights into the consistency of LDMs. Empirical evaluations on real-world image tasks, including class-conditional generation and image super-resolution, demonstrate the effectiveness of LSDM in leveraging unpaired data to enhance generation quality.
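One way to see how the two stages can jointly control an upper bound on the 1-Wasserstein distance between joint distributions is the short derivation below. It is a sketch under assumptions the abstract does not state, and the paper's actual bound may take a different form. The symbols $E$ (encoder), $D$ (decoder), $G$ (latent generator), $\eta$ (noise input), and $L_D$ are illustrative notation introduced here; the decoder is assumed to be $L_D$-Lipschitz, and the product space is assumed to carry the metric $\|x - x'\| + \|y - y'\|$.

```latex
\begin{align*}
W_1\bigl(P_{X,Y},\, P_{X,\,D(G(X,\eta))}\bigr)
&\le W_1\bigl(P_{X,Y},\, P_{X,\,D(E(Y))}\bigr)
   + W_1\bigl(P_{X,\,D(E(Y))},\, P_{X,\,D(G(X,\eta))}\bigr)
   && \text{(triangle inequality)} \\
&\le \mathbb{E}\,\bigl\|Y - D(E(Y))\bigr\|
   + \max(1, L_D)\, W_1\bigl(P_{X,\,E(Y)},\, P_{X,\,G(X,\eta)}\bigr)
   && \text{(coupling; Lipschitz push-forward)}
\end{align*}
```

In this sketch, the first term is a reconstruction error that depends only on the marginal distribution of $Y$, so unpaired samples can help reduce it in stage (i). The second term is a joint matching term in the latent space, which requires paired samples and corresponds to stage (ii).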