Since their inception, Variational Autoencoders (VAEs) have become central to machine learning. Despite their widespread use, numerous questions regarding their theoretical properties remain open. Using PAC-Bayesian theory, this work develops statistical guarantees for VAEs. First, we derive the first PAC-Bayesian bound for posterior distributions conditioned on individual samples from the data-generating distribution. Then, we use this result to develop generalization guarantees for the VAE's reconstruction loss, as well as upper bounds on the distance between the input and the regenerated distributions. More importantly, we provide upper bounds on the Wasserstein distance between the input distribution and the distribution defined by the VAE's generative model.