The Gaussian-smoothed sliced Wasserstein distance was recently introduced for comparing probability distributions while preserving the privacy of the data. It has been shown to perform comparably to its non-smoothed (non-private) counterpart. However, the computational and statistical properties of this metric have not yet been well established. This work investigates the theoretical properties of this distance as well as those of its generalized versions, referred to as Gaussian-smoothed sliced divergences. We first show that smoothing and slicing preserve the metric property and the weak topology. To study the sample complexity of such divergences, we then introduce $\hat{\hat\mu}_{n}$, the double empirical distribution for the smoothed-projected $\mu$. The distribution $\hat{\hat\mu}_{n}$ results from a double sampling process: the first sample is drawn from the original distribution $\mu$, and the second from the convolution of the projection of $\mu$ on the unit sphere with the Gaussian smoothing. We focus in particular on the Gaussian-smoothed sliced Wasserstein distance and prove that it converges at a rate $O(n^{-1/2})$. We also derive other properties, including continuity, of different divergences with respect to the smoothing parameter. We support our theoretical findings with empirical studies in the context of privacy-preserving domain adaptation.
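The double sampling process can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name, its parameters, the Monte Carlo average over random directions, and the restriction to equal sample sizes are all assumptions made for illustration. Samples are first drawn from $\mu$ and $\nu$ (the inputs `x` and `y`), then each one-dimensional projection is perturbed with independent Gaussian noise of standard deviation `sigma`, which amounts to sampling from the projected distribution convolved with the Gaussian.

```python
import numpy as np


def gaussian_smoothed_sliced_wasserstein(x, y, sigma=1.0, n_projections=200,
                                         p=2, seed=0):
    """Monte Carlo estimate of a Gaussian-smoothed sliced p-Wasserstein distance.

    Hypothetical helper for illustration; assumes x and y are arrays of
    shape (n, d) holding the same number n of samples.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.shape != y.shape or x.ndim != 2:
        raise ValueError("x and y must both have shape (n, d)")
    rng = np.random.default_rng(seed)
    d = x.shape[1]

    # Random directions, uniform on the unit sphere of R^d.
    theta = rng.standard_normal((n_projections, d))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)

    # First sampling: x and y are already samples from mu and nu.
    # Project them on every direction; result has shape (n, n_projections).
    x_proj = x @ theta.T
    y_proj = y @ theta.T

    # Second sampling: add independent Gaussian noise to each projection,
    # i.e. sample from the projected distribution convolved with N(0, sigma^2).
    x_smooth = x_proj + sigma * rng.standard_normal(x_proj.shape)
    y_smooth = y_proj + sigma * rng.standard_normal(y_proj.shape)

    # In one dimension, the p-Wasserstein distance between two empirical
    # measures of equal size is obtained by matching sorted samples.
    x_sorted = np.sort(x_smooth, axis=0)
    y_sorted = np.sort(y_smooth, axis=0)
    w_pp = np.mean(np.abs(x_sorted - y_sorted) ** p, axis=0)

    # Average over directions, then take the p-th root.
    return float(np.mean(w_pp) ** (1.0 / p))
```

Calling the function on two sample sets from the same distribution should give a value well below the one obtained after shifting one of the sets, and larger values of `sigma` trade discriminative power for stronger smoothing.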