Understanding how well a deep generative model (DGM) captures a distribution of high-dimensional data remains an important open challenge. It is especially difficult for model classes such as Generative Adversarial Networks and Diffusion Models, which do not admit exact likelihoods. In this work, we demonstrate that generalized empirical likelihood (GEL) methods offer a family of diagnostic tools that can identify many deficiencies of DGMs. We show that, with appropriate specification of moment conditions, the proposed method can identify which modes have been dropped, the degree to which DGMs are mode imbalanced, and whether DGMs sufficiently capture intra-class diversity. We show how to combine techniques from Maximum Mean Discrepancy (MMD) and GEL to create not only distribution tests that retain per-sample interpretability, but also metrics that include label information. We find that such tests predict the degree of mode dropping and mode imbalance up to 60% better than metrics such as improved precision/recall. We provide an implementation at https://github.com/deepmind/understanding_deep_generative_models_with_generalized_empirical_likelihood/.
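To make the role of moment conditions concrete, the following is a minimal sketch of one member of the GEL family, exponential tilting, applied to a toy mode-imbalance setting. It is not the implementation in the repository above. The function name `exponential_tilting_weights`, the use of one-hot class labels as moment functions, and the toy class counts are all assumptions chosen for illustration. The sketch reweights model samples so that their weighted class frequencies match a balanced target, and reports the KL divergence of the weights from uniform as a scalar score; the per-sample weights show which samples are over- or under-represented.

```python
import numpy as np


def exponential_tilting_weights(features, target, iters=100, tol=1e-10):
    """Hypothetical helper: find weights pi_i proportional to exp(lam . g_i),
    with g_i = features_i - target, such that sum_i pi_i g_i = 0.

    This minimises the convex dual log(mean(exp(g @ lam))) with damped Newton
    steps. A solution exists only if `target` lies inside the convex hull of
    the rows of `features` (for example, no class is entirely missing).
    """
    g = features - target
    n, d = g.shape
    lam = np.zeros(d)

    def objective(l):
        z = g @ l
        zmax = z.max()
        return zmax + np.log(np.mean(np.exp(z - zmax)))

    def weights(l):
        z = g @ l
        w = np.exp(z - z.max())  # subtract max for numerical stability
        return w / w.sum()

    for _ in range(iters):
        pi = weights(lam)
        grad = pi @ g  # weighted mean of the moment functions
        if np.linalg.norm(grad) < tol:
            break
        # Hessian of the dual is the weighted covariance of g.
        hess = (g * pi[:, None]).T @ g - np.outer(grad, grad)
        # Small ridge term: one-hot features make the Hessian singular.
        step = -np.linalg.solve(hess + 1e-8 * np.eye(d), grad)
        # Backtracking line search (Armijo condition).
        t, f0 = 1.0, objective(lam)
        while objective(lam + t * step) > f0 + 1e-4 * t * (grad @ step) and t > 1e-12:
            t *= 0.5
        lam = lam + t * step
    return weights(lam)


# Toy "model samples": three classes with imbalanced counts (assumed values).
labels = np.repeat([0, 1, 2], [600, 300, 100])
onehot = np.eye(3)[labels]
target = np.full(3, 1.0 / 3.0)  # balanced class frequencies in the data

pi = exponential_tilting_weights(onehot, target)
n = len(pi)
divergence = float(np.sum(pi * np.log(n * pi)))  # KL(pi || uniform)

print("weighted class mass:", pi @ onehot)
print("mean weight per class:", [pi[labels == k].mean() for k in range(3)])
print("KL from uniform:", divergence)
```

A score of zero would mean that uniform weights already satisfy the moment conditions; larger values indicate that the samples must be reweighted more heavily to match the target. If a class were missing from the samples entirely, no valid weights would exist, which is one way a moment-based test can signal mode dropping.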