The power and flexibility of Optimal Transport (OT) have pervaded a wide spectrum of problems, including recent Machine Learning challenges such as unsupervised domain adaptation. Its essence, quantitatively relating two probability distributions through some optimal metric, has been creatively exploited and shown to hold promise for many real-world data challenges. In a related vein, we posit in the present work that the robustness of domain adaptation is rooted in the intrinsic (latent) representations of the respective data, which inherently lie on a non-linear submanifold embedded in a higher-dimensional Euclidean space. We account for these geometric properties by refining the $l^2$ Euclidean metric to better reflect the geodesic distance between two distinct representations. We integrate a metric correction term, as well as a prior cluster structure in the source data, into the OT-driven adaptation. We show that this is tantamount to an implicit Bayesian framework, which we demonstrate to yield a more robust and better-performing approach to domain adaptation. Experiments substantiating these claims are included.