A major challenge in disentanglement learning with variational autoencoders is the trade-off between disentanglement and reconstruction fidelity. Previous methods that increase the information bottleneck during training tend to lose the disentanglement constraint, which leads to the information diffusion problem. In this paper, we present DeVAE, a novel framework for disentangled representation learning that uses hierarchical latent spaces with decreasing information bottlenecks across these spaces. The key innovation of our approach is to connect the hierarchical latent spaces through disentanglement-invariant transformations, which lets the spaces share disentanglement properties while maintaining an acceptable level of reconstruction performance. Through a series of experiments and ablation studies on the dSprites and Shapes3D datasets, we demonstrate that DeVAE achieves a balance between disentanglement and reconstruction. Code is available at https://github.com/erow/disentanglement_lib/tree/pytorch#devae.
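To make the idea of several latent spaces with decreasing information bottlenecks more concrete, the following is a minimal sketch of how such an objective might be assembled. It assumes that each latent space is trained with a β-VAE-style loss (reconstruction error plus a weighted KL term against a standard normal prior) and that the KL weight shrinks from one space to the next. The function names `gaussian_kl` and `hierarchical_objective` and the example weights are hypothetical. The sketch does not model the disentanglement-invariant transformations and should not be read as the DeVAE implementation.

```python
import math


def gaussian_kl(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over latent dimensions."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, logvar))


def hierarchical_objective(recon_errors, posteriors, betas):
    """Sum of per-space losses of the form recon_error + beta * KL.

    recon_errors: one reconstruction error per latent space (illustrative inputs)
    posteriors:   one (mu, logvar) pair per latent space
    betas:        KL weights; assumed here to decrease from space to space,
                  i.e. a strong bottleneck first and a weak one last
    """
    if not (len(recon_errors) == len(posteriors) == len(betas)):
        raise ValueError("expected one reconstruction error, posterior and beta per space")
    total = 0.0
    for err, (mu, logvar), beta in zip(recon_errors, posteriors, betas):
        total += err + beta * gaussian_kl(mu, logvar)
    return total


if __name__ == "__main__":
    # Two latent spaces: the first is heavily regularised, the second less so.
    posteriors = [([1.0], [0.0]), ([1.0], [0.0])]
    print(hierarchical_objective([1.0, 2.0], posteriors, betas=[4.0, 1.0]))
```

In this toy setup, a space with a large weight is pushed toward the prior, which favors disentanglement, while a space with a small weight is free to encode more information for reconstruction. How the spaces are actually coupled is the subject of the paper.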