Learning visual representations with interpretable features, i.e., disentangled representations, remains a challenging problem. Existing methods demonstrate some success but are hard to apply to large-scale vision datasets like ImageNet. In this work, we propose a simple post-processing framework to disentangle content and style in learned representations from pre-trained vision models. We model the pre-trained features probabilistically as linearly entangled combinations of the latent content and style factors and develop a simple disentanglement algorithm based on the probabilistic model. We show that the method provably disentangles content and style features and verify its efficacy empirically. Our post-processed features yield significant domain generalization performance improvements when the distribution shift occurs due to style changes or style-related spurious correlations.
翻译:学习具有可解释特征的视觉表示(即解耦表示)仍是一项具有挑战性的问题。现有方法虽取得一定成功,但难以应用于ImageNet等大规模视觉数据集。本文提出一种简单的后处理框架,从预训练视觉模型习得的表示中解耦内容与风格。我们将预训练特征概率性地建模为潜在内容与风格因子的线性纠缠组合,并基于该概率模型开发了一种简单的解耦算法。我们证明该方法可理论保证地解耦内容与风格特征,并通过实验验证其有效性。当分布偏移源于风格变化或与风格相关的虚假关联时,经后处理的特征能显著提升领域泛化性能。