In variational autoencoders (VAEs), the variational posterior often aligns closely with the prior, a phenomenon known as posterior collapse that degrades the quality of representation learning. To mitigate this problem, an adjustable hyperparameter beta has been introduced in the VAE. This paper presents a closed-form expression that relates beta, the dataset size, posterior collapse, and the rate-distortion curve, obtained by analyzing a minimal VAE in a high-dimensional limit. The results show that a long plateau in the generalization error emerges when beta is relatively large. As beta increases, the plateau lengthens, and beyond a certain beta threshold it becomes infinite. This implies that the choice of beta, unlike the usual regularization parameters, can induce posterior collapse regardless of the dataset size. Thus, beta is a risky parameter that requires careful tuning. Furthermore, the dataset-size dependence of the rate-distortion curve shows that a relatively large dataset is required to obtain a rate-distortion curve with high rates. Extensive numerical experiments support our analysis.
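To make the role of beta concrete, the following is a minimal sketch of a beta-weighted VAE objective, written as distortion plus beta times rate. It is not the minimal VAE analyzed in the paper: the function names, the unit-variance Gaussian decoder (which gives a squared-error distortion), the diagonal Gaussian posterior with a standard normal prior, and all numerical values are assumptions chosen for illustration.

```python
import math


def gaussian_kl(mu, log_var):
    """Rate term: KL(N(mu, diag(exp(log_var))) || N(0, I)), summed over latent dims."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv for m, lv in zip(mu, log_var))


def beta_vae_loss(x, x_recon, mu, log_var, beta):
    """Return (loss, distortion, rate) for one sample.

    Assumes a unit-variance Gaussian decoder, so the distortion is half the
    squared reconstruction error (additive constants dropped).
    """
    distortion = 0.5 * sum((a - b) ** 2 for a, b in zip(x, x_recon))
    rate = gaussian_kl(mu, log_var)
    return distortion + beta * rate, distortion, rate


x = [1.0, -2.0]  # hypothetical data point

# Collapsed posterior: q(z|x) equals the prior, so the rate is zero and the
# decoder can only output an input-independent reconstruction (here, zeros).
collapsed = ([0.0, 0.0], [0.0], [0.0])

# Informative posterior: the code carries information about x (nonzero rate)
# and the reconstruction is close to x.
informative = ([0.9, -1.8], [1.5], [-2.0])

for beta in (1.0, 4.0):
    loss_c, _, rate_c = beta_vae_loss(x, *collapsed, beta=beta)
    loss_i, _, rate_i = beta_vae_loss(x, *informative, beta=beta)
    print(f"beta={beta}: collapsed loss={loss_c:.3f} (rate={rate_c:.3f}), "
          f"informative loss={loss_i:.3f} (rate={rate_i:.3f})")
```

In this toy comparison the informative solution has the lower loss at beta = 1, while the collapsed solution has the lower loss at beta = 4. This only illustrates qualitatively why a larger beta can favor collapse; the threshold and the dataset-size dependence described in the abstract come from the paper's high-dimensional analysis, not from this sketch.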