Deep generative models such as VAEs and diffusion models have advanced a range of generation tasks by using latent variables to learn data distributions and generate high-quality samples. Although explainable AI has made strides in interpreting machine learning models, understanding the latent variables of generative models remains challenging. This paper introduces LatentExplainer, a framework for automatically generating semantically meaningful explanations of latent variables in deep generative models. LatentExplainer tackles three main challenges: inferring the meaning of latent variables, aligning explanations with inductive biases, and handling varying degrees of explainability. By perturbing latent variables and interpreting the resulting changes in the generated data, the framework provides a systematic approach to understanding and controlling the data generation process, which enhances the transparency and interpretability of deep generative models. We evaluate the proposed method on several real-world and synthetic datasets, and the results demonstrate superior performance in generating high-quality explanations of latent variables.
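
The perturb-and-observe idea described above can be illustrated with a minimal sketch. It is not the LatentExplainer implementation: `toy_decoder`, `traverse_latent`, and `feature_sensitivity` are hypothetical names, the decoder is a hand-written stand-in for a trained generative model, and the interpretation step is replaced by a simple numeric sensitivity score, which is an assumption made purely for illustration.

```python
import random

def toy_decoder(z):
    """Hypothetical stand-in for a trained decoder: 2-D latent -> 3 output features."""
    # Feature 0 depends only on z[0], feature 1 only on z[1], feature 2 on neither.
    return [2.0 * z[0], -1.0 * z[1], 0.5]

def traverse_latent(decoder, z, dim, deltas):
    """Decode copies of z in which only latent dimension `dim` is shifted by each delta."""
    outputs = []
    for delta in deltas:
        z_shifted = list(z)
        z_shifted[dim] += delta
        outputs.append(decoder(z_shifted))
    return outputs

def feature_sensitivity(decoder, z, dim, deltas):
    """Mean absolute change of each output feature relative to the unperturbed decode."""
    base = decoder(z)
    outputs = traverse_latent(decoder, z, dim, deltas)
    return [
        sum(abs(out[i] - base[i]) for out in outputs) / len(outputs)
        for i in range(len(base))
    ]

random.seed(0)
z0 = [random.gauss(0.0, 1.0) for _ in range(2)]  # a sampled latent code
deltas = [-1.0, 1.0]                             # perturbation sizes (assumed)

sens_dim0 = feature_sensitivity(toy_decoder, z0, 0, deltas)
sens_dim1 = feature_sensitivity(toy_decoder, z0, 1, deltas)
print("latent 0 affects features:", sens_dim0)
print("latent 1 affects features:", sens_dim1)
```

In this toy setting, perturbing one latent dimension changes only the output feature it controls, which is the kind of signal an explanation of that latent variable would be built on. With a real model, the decoder call would be replaced by the model's own generation step and the outputs would be images or other data, not three numbers.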