Developing meaningful and efficient representations that separate the fundamental structure of the data generation mechanism is crucial in representation learning. However, Disentangled Representation Learning has not yet fully shown its potential on real images, due to correlated generative factors, their low resolution, and limited access to ground-truth labels. Focusing on the latter, we investigate the possibility of leveraging synthetic data to learn general-purpose disentangled representations applicable to real data, discussing the effect of fine-tuning and which properties of disentanglement are preserved after the transfer. We provide an extensive empirical study to address these questions. In addition, we propose a new interpretable intervention-based metric to measure the quality of factor encoding in the representation. Our results indicate that transferring a representation from synthetic to real data is possible and effective, and that some level of disentanglement is preserved.