Optimal transport and the Wasserstein distance $\mathcal{W}_p$ have recently found numerous applications in statistics, machine learning, data science, and the physical sciences. These applications are, however, severely restricted by the curse of dimensionality: the number of data points needed to estimate $\mathcal{W}_p$ accurately grows exponentially with the dimension. To alleviate this problem, a number of variants of $\mathcal{W}_p$ have been introduced. We focus here on one of these variants, namely the max-sliced Wasserstein metric $\overline{\mathcal{W}}_p$. This metric reduces the high-dimensional minimization problem defining $\mathcal{W}_p$ to a maximum of one-dimensional measurements, in an effort to overcome the curse of dimensionality. In this note we derive concentration results and upper bounds on the expectation of $\overline{\mathcal{W}}_p$ between the true and empirical measure on unbounded reproducing kernel Hilbert spaces. We show that, under quite generic assumptions, probability measures concentrate uniformly fast in one-dimensional subspaces, at (nearly) parametric rates. Our results rely on an improvement of currently known bounds for $\overline{\mathcal{W}}_p$ in the finite-dimensional case.
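To make the reduction to one-dimensional measurements concrete, the following is a minimal numerical sketch in finite dimension. It assumes the usual definition $\overline{\mathcal{W}}_p(\mu,\nu) = \sup_{\|\theta\|=1} \mathcal{W}_p(\theta_\#\mu, \theta_\#\nu)$, where $\theta_\#$ denotes the pushforward under $x \mapsto \langle \theta, x \rangle$; the abstract does not spell this definition out. The function names `wasserstein_1d` and `max_sliced_wasserstein_lower_bound` are hypothetical, and the random search over directions is an illustrative choice that only yields a lower bound on the supremum. It is not the method analysed in this note.

```python
import numpy as np


def wasserstein_1d(u, v, p=2):
    """W_p between two 1-D empirical measures with the same number of atoms.

    In one dimension the optimal coupling matches order statistics,
    so it suffices to sort both samples and compare them entrywise.
    """
    u_sorted = np.sort(u)
    v_sorted = np.sort(v)
    return float(np.mean(np.abs(u_sorted - v_sorted) ** p) ** (1.0 / p))


def max_sliced_wasserstein_lower_bound(x, y, p=2, n_directions=500, seed=0):
    """Random-search lower bound on the max-sliced W_p between two samples.

    x, y: arrays of shape (n, d) with the same n (assumption of this sketch).
    Only finitely many directions are tried, so the returned value can
    underestimate the true supremum over the unit sphere.
    """
    rng = np.random.default_rng(seed)
    d = x.shape[1]
    # Uniform directions on the unit sphere: normalise Gaussian vectors.
    thetas = rng.standard_normal((n_directions, d))
    thetas /= np.linalg.norm(thetas, axis=1, keepdims=True)
    best = 0.0
    for theta in thetas:
        # Project both samples onto the direction and compare in 1-D.
        best = max(best, wasserstein_1d(x @ theta, y @ theta, p))
    return best
```

For example, if `y` is obtained from `x` by translating every point by a fixed vector `t`, each projected distance equals $|\langle \theta, t \rangle|$, so the value returned never exceeds $\|t\|$ (up to rounding) and approaches it as more directions are sampled.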