Developing meaningful and efficient representations that disentangle the fundamental structure of the data generation mechanism is crucial in representation learning. However, Disentangled Representation Learning has not yet fully shown its potential on real images, due to correlated generative factors, their resolution, and limited access to ground-truth labels. Focusing on the latter, we investigate the possibility of leveraging synthetic data to learn general-purpose disentangled representations that are applicable to real data, discussing the effect of fine-tuning and which properties of disentanglement are preserved after the transfer. We provide an extensive empirical study to address these issues. In addition, we propose a new interpretable intervention-based metric to measure the quality of factor encoding in the representation. Our results indicate that transferring a representation from synthetic to real data is possible and effective, and that some level of disentanglement is preserved.