The maximum mean discrepancy (MMD) and the Wasserstein distance are popular distance measures between distributions and play important roles in many machine learning problems, such as metric learning, generative modeling, domain adaptation, and clustering. However, since they are functions of pairwise distances between data points drawn from the two distributions, they do not exploit potential manifold properties of the data, such as smoothness, and hence are not effective at measuring the dissimilarity between two distributions supported on manifolds. In this paper, in contrast to existing measures, we propose a novel distance called the Mutual Regression Distance (MRD), induced by a constrained mutual regression problem, which can exploit the manifold property of data. We prove that MRD is a pseudometric satisfying almost all the axioms of a metric. Since optimizing the original MRD is costly, we provide a tight MRD and a simplified MRD, based on which a heuristic algorithm is established. We also provide kernel variants of MRD that handle nonlinear data more effectively. Our MRDs, especially the simplified MRD, have much lower computational complexity than the Wasserstein distance. We provide theoretical guarantees for MRD, such as robustness. Finally, we apply MRD to distribution clustering, generative modeling, and domain adaptation. The numerical results demonstrate the effectiveness and superiority of MRD compared with the baselines.
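To make concrete the claim that such measures are functions of pairwise distances between data points, the following is a minimal NumPy sketch (not the MRD proposed here) of a standard biased MMD estimator with an RBF kernel; the bandwidth parameter `gamma` and the Gaussian test data are illustrative assumptions:

```python
import numpy as np

def rbf_mmd2(X, Y, gamma=1.0):
    """Biased estimate of squared MMD between samples X and Y under an
    RBF kernel k(x, y) = exp(-gamma * ||x - y||^2). Note that the estimate
    depends on the data only through pairwise squared distances."""
    def kernel(A, B):
        # Pairwise squared Euclidean distances via the expansion
        # ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b, then the RBF kernel.
        sq = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
        return np.exp(-gamma * np.maximum(sq, 0.0))
    return kernel(X, X).mean() + kernel(Y, Y).mean() - 2 * kernel(X, Y).mean()

rng = np.random.default_rng(0)
X = rng.normal(0.0, 1.0, size=(200, 2))  # samples from N(0, I)
Y = rng.normal(3.0, 1.0, size=(200, 2))  # samples from a shifted Gaussian
Z = rng.normal(0.0, 1.0, size=(200, 2))  # fresh samples from N(0, I)
print(rbf_mmd2(X, Y))  # clearly positive: distributions differ
print(rbf_mmd2(X, Z))  # near zero: same underlying distribution
```

Because the estimator sees only pairwise distances, two point clouds lying on the same smooth manifold but sampled at different locations can still score as far apart, which is the limitation the abstract attributes to MMD and Wasserstein distance.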