The variational autoencoder (VAE) estimates the posterior parameters (mean and variance) of the latent variables for each input. Although it is used for many tasks, the transparency of the model remains an underlying issue. This paper provides a quantitative understanding of VAE properties through differential-geometric and information-theoretic interpretations of the VAE. According to rate-distortion theory, optimal transform coding is achieved by an orthonormal transform with the PCA basis, under which the transform space is isometric to the input space. Drawing on the analogy between transform coding and the VAE, we show theoretically and experimentally that a VAE can be mapped to an implicit isometric embedding with a scale factor derived from the posterior parameters. As a result, the data probabilities in the input space can be estimated from the prior, the loss metrics, and the corresponding posterior parameters; furthermore, the quantitative importance of each latent variable can be evaluated in the same way as an eigenvalue in PCA.
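The transform-coding premise in the abstract can be illustrated with a minimal NumPy sketch. It is not the paper's method and involves no VAE: it only checks, on synthetic data, that an orthonormal PCA transform preserves Euclidean distances (isometry) and that the variance along each transformed coordinate equals the corresponding eigenvalue, which is the sense in which eigenvalues rank the importance of components. The data, the mixing matrix `A`, and all variable names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic correlated data (illustrative assumption): 500 samples, 3 dimensions.
A = rng.normal(size=(3, 3))            # arbitrary mixing matrix
X = rng.normal(size=(500, 3)) @ A
X = X - X.mean(axis=0)                 # center the data

# PCA basis: eigenvectors of the sample covariance matrix.
cov = X.T @ X / (len(X) - 1)
eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns ascending eigenvalues
order = np.argsort(eigvals)[::-1]       # reorder to descending
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Orthonormal transform into PCA coordinates.
Z = X @ eigvecs

# Isometry check: the distance between two samples is unchanged by the transform.
d_input = np.linalg.norm(X[0] - X[1])
d_transformed = np.linalg.norm(Z[0] - Z[1])

# Importance check: the variance of each PCA coordinate equals its eigenvalue.
coord_var = Z.var(axis=0, ddof=1)

print("distance preserved:", np.isclose(d_input, d_transformed))
print("variances match eigenvalues:", np.allclose(coord_var, eigvals))
```

The abstract's claim is that a VAE's latent space admits an analogous reading once each latent dimension is rescaled by a factor derived from the posterior parameters; that rescaling is not reproduced in this sketch.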