The Gaussian-smoothed sliced Wasserstein distance was recently introduced for comparing probability distributions while preserving the privacy of the data. It has been shown to perform similarly to its non-smoothed (non-private) counterpart. However, the computational and statistical properties of this metric are not yet well established. This work investigates the theoretical properties of this distance and of its generalizations, which we call Gaussian-smoothed sliced divergences. We first show that smoothing and slicing preserve the metric property and the weak topology. To study the sample complexity of such divergences, we then introduce $\hat{\hat\mu}_{n}$, the double empirical distribution of the smoothed-projected $\mu$. The distribution $\hat{\hat\mu}_{n}$ results from a double sampling process: the first step samples from the original distribution $\mu$, and the second samples from the convolution of the projection of $\mu$ on the unit sphere with the Gaussian smoothing. We focus in particular on the Gaussian-smoothed sliced Wasserstein distance and prove that it converges at a rate of $O(n^{-1/2})$. We also derive other properties of these divergences, including continuity with respect to the smoothing parameter. We support our theoretical findings with empirical studies in the context of privacy-preserving domain adaptation.
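The double sampling process described in the abstract can be illustrated with a small Monte Carlo sketch. It is not the authors' implementation: the function names, the defaults (`sigma`, `n_proj`, `p`, `seed`), and the use of equal sample sizes are assumptions made for illustration. The first sampling step is represented by the input point sets, taken as draws from $\mu$ and $\nu$. The second is emulated by projecting each point on a random unit direction and adding independent $N(0, \sigma^2)$ noise, which amounts to drawing from the projected empirical measure convolved with a Gaussian.

```python
import math
import random


def random_direction(d, rng):
    """Draw a direction uniformly from the unit sphere in R^d."""
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def smoothed_projection(points, theta, sigma, rng):
    """Project each point on theta, add N(0, sigma^2) noise, and sort.

    Adding independent Gaussian noise to a projected sample is one way to
    sample from the projected empirical measure convolved with a Gaussian.
    """
    return sorted(
        sum(a * b for a, b in zip(x, theta)) + rng.gauss(0.0, sigma)
        for x in points
    )


def gs_sliced_wasserstein(xs, ys, sigma=1.0, n_proj=50, p=2, seed=0):
    """Monte Carlo estimate of a Gaussian-smoothed sliced p-Wasserstein distance.

    Assumes xs and ys hold the same number of points (hypothetical
    simplification), so each 1D distance reduces to comparing sorted samples.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    rng = random.Random(seed)
    d = len(xs[0])
    total = 0.0
    for _ in range(n_proj):
        theta = random_direction(d, rng)
        px = smoothed_projection(xs, theta, sigma, rng)
        py = smoothed_projection(ys, theta, sigma, rng)
        # 1D p-Wasserstein cost between two equal-size empirical measures.
        total += sum(abs(a - b) ** p for a, b in zip(px, py)) / len(px)
    # Average the p-th powers over directions, then take the p-th root.
    return (total / n_proj) ** (1.0 / p)
```

Calling `gs_sliced_wasserstein(xs, ys, sigma=1.0)` on two lists of equal-length coordinate lists returns a non-negative float. Fixing `seed` makes the estimate reproducible, and a larger `sigma` adds more noise to each projection before the comparison.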