Generative networks have achieved great empirical success in distribution learning. Many existing experiments have demonstrated that generative networks can generate high-dimensional complex data from a low-dimensional easy-to-sample distribution. However, this phenomenon cannot be justified by existing theories. The widely held manifold hypothesis speculates that real-world data sets, such as natural images and signals, exhibit low-dimensional geometric structures. In this paper, we take such low-dimensional data structures into consideration by assuming that data distributions are supported on a low-dimensional manifold. We prove statistical guarantees for generative networks under the Wasserstein-1 loss. We show that the Wasserstein-1 loss converges to zero at a fast rate that depends on the intrinsic dimension rather than the ambient data dimension. Our theory leverages the low-dimensional geometric structures in data sets and justifies the practical power of generative networks. We require no smoothness assumptions on the data distribution, which is desirable in practice.