The Sinkhorn algorithm has been used pervasively to approximate solutions to optimal transport (OT) and unbalanced optimal transport (UOT) problems. However, its practical application is limited by its high computational complexity. To alleviate this computational burden, we propose a novel importance sparsification method, called Spar-Sink, to efficiently approximate entropy-regularized OT and UOT solutions. Specifically, our method uses natural upper bounds on the unknown optimal transport plans to establish effective sampling probabilities, and constructs a sparse kernel matrix to accelerate Sinkhorn iterations, reducing the computational cost of each iteration from $O(n^2)$ to $\widetilde{O}(n)$ for a sample of size $n$. Theoretically, we show that the proposed estimators for the regularized OT and UOT problems are consistent under mild regularity conditions. Experiments on various synthetic datasets demonstrate that Spar-Sink outperforms mainstream competitors in terms of both estimation error and speed. A real-world echocardiogram data analysis shows that Spar-Sink can effectively estimate and visualize cardiac cycles, from which one can identify heart failure and arrhythmia. To evaluate the numerical accuracy of cardiac cycle prediction, we consider the task of predicting the end-systole time point from the end-diastole one. Results show that Spar-Sink performs as well as the classical Sinkhorn algorithm while requiring significantly less computational time.
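The sparsification idea described above can be illustrated with a minimal sketch for the balanced OT case. It is not the authors' implementation: the abstract does not state the exact sampling probabilities, so the sketch assumes they are proportional to $\sqrt{a_i b_j}$, one natural choice given that any transport plan satisfies $T_{ij} \le \min(a_i, b_j) \le \sqrt{a_i b_j}$. The function names `sparsify_kernel` and `sparse_sinkhorn`, the parameter `s` (expected number of retained entries), and the triple-list storage are hypothetical and chosen only for illustration.

```python
import math
import random


def sparsify_kernel(cost, a, b, eps, s, rng):
    """Sample a sparse, unbiased surrogate of K = exp(-cost / eps).

    Assumption (not stated in the abstract): entry (i, j) is kept with
    probability p_ij = min(1, s * q_ij), where q_ij is proportional to
    sqrt(a_i * b_j). Kept entries are divided by p_ij so that the sparse
    kernel equals K in expectation.
    """
    n, m = len(a), len(b)
    weights = [[math.sqrt(a[i] * b[j]) for j in range(m)] for i in range(n)]
    total = sum(sum(row) for row in weights)
    entries = []  # (row, column, value) triples of the sparse kernel
    for i in range(n):
        for j in range(m):
            p = min(1.0, s * weights[i][j] / total)
            if rng.random() < p:
                entries.append((i, j, math.exp(-cost[i][j] / eps) / p))
    return entries


def sparse_sinkhorn(entries, a, b, n_iter=200):
    """Run Sinkhorn scaling on the sparse kernel.

    Each iteration touches only the stored entries, so its cost is
    proportional to len(entries) rather than to n * m.
    """
    n, m = len(a), len(b)
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iter):
        kv = [0.0] * n
        for i, j, k in entries:
            kv[i] += k * v[j]
        # A row with no sampled entry receives scaling 0 (it carries no mass).
        u = [a[i] / kv[i] if kv[i] > 0 else 0.0 for i in range(n)]
        ktu = [0.0] * m
        for i, j, k in entries:
            ktu[j] += k * u[i]
        v = [b[j] / ktu[j] if ktu[j] > 0 else 0.0 for j in range(m)]
    return u, v


if __name__ == "__main__":
    rng = random.Random(0)
    n = 50
    x = [rng.random() for _ in range(n)]
    cost = [[(xi - xj) ** 2 for xj in x] for xi in x]
    a = [1.0 / n] * n
    b = [1.0 / n] * n
    entries = sparsify_kernel(cost, a, b, eps=0.1, s=8 * n, rng=rng)
    u, v = sparse_sinkhorn(entries, a, b)
    print(f"kept {len(entries)} of {n * n} kernel entries")
```

The approximate plan is recovered entrywise as `u[i] * k * v[j]` over the stored triples. Choosing `s` to grow roughly linearly in $n$ (up to logarithmic factors) is what would give the $\widetilde{O}(n)$ per-iteration cost quoted above. Note that this sketch still spends $O(n^2)$ once to build the sparse kernel; only the iterations themselves are accelerated.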