Learning disentangled representations with variational autoencoders (VAEs) is often attributed to the regularisation component of the loss. In this work, we highlight the interaction between data and the reconstruction term of the loss as the main contributor to disentanglement in VAEs. We show that standard benchmark datasets have unintended correlations between their subjective ground-truth factors and the axes of the data as perceived by typical VAE reconstruction losses. Our work exploits this relationship to provide a theory for what constitutes an adversarial dataset under a given reconstruction loss. We verify this by constructing an example dataset that prevents disentanglement in state-of-the-art frameworks while maintaining human-intuitive ground-truth factors. Finally, we re-enable disentanglement by designing an example reconstruction loss that is once again able to perceive the ground-truth factors. Our findings demonstrate the subjective nature of disentanglement and the importance of considering the interaction between the ground-truth factors, the data and, notably, the reconstruction loss, which is under-recognised in the literature.