We propose a comprehensive sample-based method for assessing the quality of generative models. The method estimates the probability that two sets of samples are drawn from the same distribution, which gives a statistically rigorous way to assess a single generative model or to compare multiple competing models trained on the same dataset. The comparison is carried out by dividing the data space into non-overlapping regions and comparing the number of samples that fall in each region. The method requires only samples from the generative model and from the test data. It operates directly on high-dimensional data, with no need for dimensionality reduction. Notably, it makes no assumptions about the density of the true distribution and does not rely on training or fitting any auxiliary models. Instead, it approximates the integral of the density (the probability mass) over sub-regions of the data space.
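To make the region-based comparison concrete, the following is a minimal sketch of one way it could be carried out. The specific choices here are assumptions made for illustration and are not details stated in the text: the regions are Voronoi cells around reference points drawn at random from the pooled samples, the per-region counts are compared with a Pearson chi-squared statistic, and the p-value comes from a permutation test. The function names (`assign_cells`, `chi2_statistic`, `region_count_test`) and the parameters `n_cells` and `n_perm` are hypothetical.

```python
import math
import random


def assign_cells(points, centers):
    """Return, for each point, the index of its nearest center (its Voronoi cell)."""
    return [
        min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
        for p in points
    ]


def chi2_statistic(cells_a, cells_b, n_cells):
    """Pearson chi-squared statistic comparing the per-cell counts of two sample sets."""
    n_a, n_b = len(cells_a), len(cells_b)
    counts_a = [0] * n_cells
    counts_b = [0] * n_cells
    for c in cells_a:
        counts_a[c] += 1
    for c in cells_b:
        counts_b[c] += 1
    stat = 0.0
    for ca, cb in zip(counts_a, counts_b):
        total = ca + cb
        if total == 0:
            continue  # a cell with no samples carries no information
        # Expected counts if both sets place the same probability mass in this cell.
        exp_a = total * n_a / (n_a + n_b)
        exp_b = total * n_b / (n_a + n_b)
        stat += (ca - exp_a) ** 2 / exp_a + (cb - exp_b) ** 2 / exp_b
    return stat


def region_count_test(samples_a, samples_b, n_cells=8, n_perm=200, seed=0):
    """Return (statistic, permutation p-value) for the region-count comparison."""
    rng = random.Random(seed)
    pooled = list(samples_a) + list(samples_b)
    # Assumption: reference points are drawn at random from the pooled samples.
    centers = rng.sample(pooled, n_cells)
    cells = assign_cells(pooled, centers)
    n_a = len(samples_a)
    observed = chi2_statistic(cells[:n_a], cells[n_a:], n_cells)
    # Permutation test: shuffle which samples belong to which set, keep cells fixed.
    shuffled = cells[:]
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if chi2_statistic(shuffled[:n_a], shuffled[n_a:], n_cells) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)


if __name__ == "__main__":
    rng = random.Random(1)
    dim = 16  # the sketch works on the raw coordinates, without dimensionality reduction
    test_data = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(200)]
    model_out = [[rng.gauss(0.5, 1.0) for _ in range(dim)] for _ in range(200)]
    stat, p_value = region_count_test(test_data, model_out)
    print(f"chi2 = {stat:.2f}, permutation p-value = {p_value:.3f}")
```

Only distances between samples and per-region counts are used, so the sketch needs neither a density model nor any auxiliary training. A small p-value suggests that the two sample sets place different probability mass in at least some regions; repeating the test over several random partitions would reduce the dependence on any single choice of reference points.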