Learning disentangled representations from unlabelled data is a fundamental challenge in machine learning. Solving it may unlock other problems, such as generalization, interpretability, or fairness. Although remarkably challenging to solve in theory, disentanglement is often achieved in practice through prior matching. Furthermore, recent works have shown that prior matching approaches can be enhanced by leveraging geometrical considerations, e.g., by learning representations that preserve geometric features of the data, such as distances or angles between points. However, matching the prior while preserving geometric features is challenging, as a mapping that fully preserves these features while aligning the data distribution with the prior does not exist in general. To address these challenges, we introduce a novel approach to disentangled representation learning based on quadratic optimal transport. We formulate the problem using Gromov-Monge maps that transport one distribution onto another with minimal distortion of predefined geometric features, preserving them as faithfully as possible. To compute such maps, we propose the Gromov-Monge-Gap (GMG), a regularizer quantifying whether a map moves a reference distribution with minimal geometry distortion. We demonstrate the effectiveness of our approach for disentanglement across four standard benchmarks, outperforming other methods leveraging geometric considerations.
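To make the notion of "geometry distortion" concrete, the sketch below measures how much a map changes pairwise Euclidean distances on a sample batch. This is only an illustrative simplification, not the paper's GMG estimator (which additionally subtracts the optimal quadratic transport cost); all function names here are hypothetical.

```python
import numpy as np

def pairwise_dists(x):
    # Euclidean distance matrix for a batch of points of shape (n, d).
    diff = x[:, None, :] - x[None, :, :]
    return np.sqrt((diff ** 2).sum(-1))

def distortion(x, fx):
    # Mean squared change in pairwise distances induced by mapping x -> fx.
    # Zero iff the map acts as an isometry on this batch.
    dx, dfx = pairwise_dists(x), pairwise_dists(fx)
    n = x.shape[0]
    return ((dx - dfx) ** 2).sum() / (n * n)

rng = np.random.default_rng(0)
x = rng.normal(size=(64, 2))

# A rotation preserves all pairwise distances, so its distortion vanishes;
# a scaling changes them, so its distortion is strictly positive.
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
print(distortion(x, x @ R.T))  # ~0: rotation is an isometry
print(distortion(x, 2.0 * x))  # > 0: scaling distorts distances
```

A regularizer of this kind can be added to a prior-matching objective to bias the learned map toward distance preservation, which is the intuition the GMG formalizes via Gromov-Monge maps.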