Sliced optimal transport (SOT), or sliced Wasserstein (SW) distance, is widely recognized for its statistical and computational scalability. In this work, we further enhance computational scalability by proposing the first method for estimating SW from sample streams, called \emph{streaming sliced Wasserstein} (Stream-SW). To define Stream-SW, we first introduce a streaming estimator of the one-dimensional Wasserstein distance (1DW). Since the 1DW has a closed-form expression in terms of the absolute difference between the quantile functions of the compared distributions, we leverage quantile approximation techniques for sample streams to define a streaming 1DW estimator. By applying the streaming 1DW to all projections, we obtain Stream-SW. The key advantage of Stream-SW is its low memory complexity, which comes with theoretical guarantees on the approximation error. We demonstrate that Stream-SW achieves a more accurate approximation of SW than random subsampling, with lower memory consumption, when comparing Gaussian distributions and mixtures of Gaussians from streaming samples. Additionally, we conduct experiments on point cloud classification, point cloud gradient flows, and streaming change point detection to further highlight the favorable performance of the proposed Stream-SW.
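The construction described in the abstract can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation. It relies on the standard closed form $W_p^p(\mu,\nu)=\int_0^1 |F_\mu^{-1}(t)-F_\nu^{-1}(t)|^p\,dt$ in one dimension and on averaging that quantity over random projection directions. The quantile sketch used here is a simplified buffer-compaction scheme chosen for brevity; the abstract does not say which quantile approximation technique the paper uses. The names `CompactorSketch`, `stream_w1d`, and `StreamSW`, the buffer size `k`, the quantile grid size, and the number of projections are all hypothetical choices.

```python
import numpy as np


class CompactorSketch:
    """Simplified compaction-based quantile sketch (illustrative stand-in,
    not necessarily the sketch used in the paper)."""

    def __init__(self, k=64, seed=0):
        assert k % 2 == 0, "k must be even so compaction preserves total weight"
        self.k = k
        self.levels = [[]]  # levels[h] holds items that each carry weight 2**h
        self.rng = np.random.default_rng(seed)

    def update(self, x):
        self.levels[0].append(float(x))
        h = 0
        # When a buffer is full, sort it and promote every other item.
        while len(self.levels[h]) >= self.k:
            buf = sorted(self.levels[h])
            offset = int(self.rng.integers(2))  # random parity keeps ranks unbiased
            if h + 1 == len(self.levels):
                self.levels.append([])
            self.levels[h + 1].extend(buf[offset::2])
            self.levels[h] = []
            h += 1

    def quantiles(self, ts):
        vals = np.array([v for buf in self.levels for v in buf])
        wts = np.array([2.0 ** h for h, buf in enumerate(self.levels) for _ in buf])
        order = np.argsort(vals)
        vals, wts = vals[order], wts[order]
        cdf = np.cumsum(wts) / wts.sum()
        idx = np.searchsorted(cdf, ts, side="left")
        return vals[np.minimum(idx, len(vals) - 1)]


def stream_w1d(sk_x, sk_y, p=2, grid=256):
    """Approximate W_p^p in 1D from two sketches via a midpoint rule on (0, 1)."""
    ts = (np.arange(grid) + 0.5) / grid
    return float(np.mean(np.abs(sk_x.quantiles(ts) - sk_y.quantiles(ts)) ** p))


class StreamSW:
    """One pair of quantile sketches per fixed random projection direction."""

    def __init__(self, dim, n_proj=50, k=64, p=2, seed=0):
        rng = np.random.default_rng(seed)
        theta = rng.normal(size=(n_proj, dim))
        self.theta = theta / np.linalg.norm(theta, axis=1, keepdims=True)
        self.p = p
        self.sk_x = [CompactorSketch(k, seed + 1 + i) for i in range(n_proj)]
        self.sk_y = [CompactorSketch(k, seed + 1 + n_proj + i) for i in range(n_proj)]

    def update_x(self, x):
        for sk, v in zip(self.sk_x, self.theta @ x):
            sk.update(v)

    def update_y(self, y):
        for sk, v in zip(self.sk_y, self.theta @ y):
            sk.update(v)

    def estimate(self):
        w = [stream_w1d(a, b, self.p) for a, b in zip(self.sk_x, self.sk_y)]
        return float(np.mean(w)) ** (1.0 / self.p)


# Usage: two Gaussian streams in 4 dimensions whose means differ by 2 along one axis.
rng = np.random.default_rng(1)
d, n = 4, 2000
X = rng.normal(size=(n, d))
Y = rng.normal(size=(n, d)) + np.array([2.0, 0.0, 0.0, 0.0])
ssw = StreamSW(dim=d)
for x, y in zip(X, Y):  # samples arrive one at a time
    ssw.update_x(x)
    ssw.update_y(y)
est = ssw.estimate()

# Reference value computed from all stored samples with the same projections.
px = np.sort(X @ ssw.theta.T, axis=0)
py = np.sort(Y @ ssw.theta.T, axis=0)
exact = float(np.mean(np.abs(px - py) ** 2)) ** 0.5
print(est, exact)
```

Each sketch stores fewer than `k` values per level and gains a level roughly every time the stream length doubles, so memory grows logarithmically with the number of samples rather than linearly. For this pair of Gaussians the population SW distance of order 2 is $\|m\|/\sqrt{d} = 1$, which gives a rough sanity check for both printed values.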