Variational autoencoders (VAEs) frequently suffer from posterior collapse, where latent variables become uninformative and the approximate posterior degenerates to the prior. Recent work has characterized this phenomenon as a phase transition governed by the spectral properties of the data covariance matrix. In this paper, we propose a fundamentally different approach: instead of avoiding collapse through architectural constraints or hyperparameter tuning, we eliminate the possibility of collapse altogether by leveraging the multiplicity of Gaussian mixture model (GMM) clusterings. We introduce Historical Consensus Training, an iterative selection procedure that progressively refines a set of candidate GMM priors through alternating optimization and selection. The key insight is that models trained to satisfy multiple distinct clustering constraints develop a historical barrier -- a region in parameter space that remains stable even when subsequently trained with a single objective. We prove that this barrier excludes the collapsed solution, and demonstrate through extensive experiments on synthetic and real-world datasets that our method achieves non-collapsed representations regardless of decoder variance or regularization strength. Our approach requires no explicit stability conditions (e.g., $\sigma'^2 < \lambda_{\max}$) and works with arbitrary neural architectures. The code is available at https://github.com/tsegoochang/historical-consensus-vae.
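The abstract does not specify the algorithm's details, but the overall shape of Historical Consensus Training -- alternating an optimization phase over candidate GMM priors with a selection phase that prunes redundant candidates -- can be sketched as a toy loop. Everything below (the EM-style refinement, the hard-assignment clustering, and the distinct-labeling selection criterion) is an illustrative assumption, not the paper's actual method:

```python
# Illustrative sketch only: the paper does not specify Historical Consensus
# Training in detail, so this toy loop shows only the *shape* of the
# procedure -- alternating optimization of candidate GMM priors with a
# selection step that prunes candidates whose induced clusterings coincide.
# The EM-style update and the distinctness criterion are assumptions.
import numpy as np

def em_step(X, means):
    """One EM-style refinement of a candidate prior's component means."""
    # E-step: hard-assign each point to its nearest component.
    d = np.linalg.norm(X[:, None, :] - means[None, :, :], axis=2)
    labels = d.argmin(axis=1)
    # M-step: move each mean to the centroid of its assigned points.
    new_means = means.copy()
    for k in range(len(means)):
        if (labels == k).any():
            new_means[k] = X[labels == k].mean(axis=0)
    return new_means, labels

def historical_consensus(X, candidates, rounds=5):
    """Alternate optimization and selection over candidate GMM priors.

    Selection keeps only candidates whose clusterings are pairwise
    distinct (a stand-in for the paper's consensus criterion)."""
    for _ in range(rounds):
        refined, labelings = [], []
        for means in candidates:                 # optimization phase
            means, labels = em_step(X, means)
            refined.append(means)
            labelings.append(tuple(labels))
        candidates, seen = [], set()             # selection phase
        for means, lab in zip(refined, labelings):
            if lab not in seen:                  # drop duplicate clusterings
                seen.add(lab)
                candidates.append(means)
    return candidates

# Two well-separated Gaussian blobs; four random candidate priors.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-3, 0.5, (50, 2)), rng.normal(3, 0.5, (50, 2))])
init = [rng.normal(0, 3, (2, 2)) for _ in range(4)]
survivors = historical_consensus(X, init)
```

In this sketch the surviving candidates are the "historical" set: each represents a distinct clustering constraint, and a model trained against all of them at once is what the paper argues acquires a barrier against the collapsed solution.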