A New Robust Partial $p$-Wasserstein-Based Metric for Comparing Distributions

The $2$-Wasserstein distance is sensitive to minor geometric differences between distributions, making it a very powerful dissimilarity metric. However, due to this sensitivity, a small outlier mass can also cause a significant increase in the $2$-Wasserstein distance between two similar distributions. Similarly, sampling discrepancy can cause the empirical $2$-Wasserstein distance on $n$ samples in $\mathbb{R}^2$ to converge to the true distance at a rate of $n^{-1/4}$, which is significantly slower than the rate of $n^{-1/2}$ for the $1$-Wasserstein distance. We introduce a new family of distances parameterized by $k \ge 0$, called $k$-RPW, that is based on computing the partial $2$-Wasserstein distance. We show that (1) $k$-RPW satisfies the metric properties, (2) $k$-RPW is robust to small outlier mass while retaining the sensitivity of the $2$-Wasserstein distance to minor geometric differences, and (3) when $k$ is a constant, the $k$-RPW distance between empirical distributions on $n$ samples in $\mathbb{R}^2$ converges to the true distance at a rate of $n^{-1/3}$, which is faster than the convergence rate of $n^{-1/4}$ for the $2$-Wasserstein distance. Using the partial $p$-Wasserstein distance, we extend our distance to any $p \in [1,\infty]$. By setting the parameters $k$ or $p$ appropriately, we can reduce our distance to the total variation (TV), $p$-Wasserstein, and L\'evy-Prokhorov distances. Experiments show that our distance function achieves higher accuracy than the $1$-Wasserstein, $2$-Wasserstein, and TV distances for image retrieval tasks on noisy real-world data sets.
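The outlier sensitivity described above can be illustrated with a small sketch. It does not implement $k$-RPW or the partial Wasserstein distance; it only compares the ordinary $1$- and $2$-Wasserstein distances between two one-dimensional empirical distributions, using the standard fact that in one dimension, for equal-size samples, the optimal transport plan matches sorted points in order. The helper name `wasserstein_1d` and the toy data are assumptions made for illustration.

```python
def wasserstein_1d(xs, ys, p):
    """p-Wasserstein distance between two equal-size 1-D empirical
    distributions (uniform weights), via the sorted matching."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    n = len(xs)
    # In 1-D the optimal coupling pairs the i-th smallest points.
    cost = sum(abs(a - b) ** p for a, b in zip(sorted(xs), sorted(ys)))
    return (cost / n) ** (1.0 / p)


n = 100
clean = [i / n for i in range(n)]   # 100 points spread over [0, 1)
noisy = clean[:-1] + [100.0]        # move 1% of the mass far away

w1 = wasserstein_1d(clean, noisy, 1)
w2 = wasserstein_1d(clean, noisy, 2)

print(f"W1 = {w1:.4f}")  # W1 = 0.9901
print(f"W2 = {w2:.4f}")  # W2 = 9.9010
```

In this toy setup a single outlier carrying 1% of the mass at distance $D \approx 99$ contributes about $0.01\,D$ to the $1$-Wasserstein distance but about $\sqrt{0.01}\,D = 0.1\,D$ to the $2$-Wasserstein distance, which is the kind of inflation that a robust partial distance is meant to limit.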
