Domain decomposition has been shown to be a computationally efficient distributed method for solving large-scale entropic optimal transport problems. However, a naive implementation of the algorithm can freeze in the limit of very fine partition cells (i.e., it asymptotically becomes stationary and does not find the global minimizer), since information can only travel slowly between cells. In practice this can be avoided by a coarse-to-fine multiscale scheme. In this article we introduce flow updates as an alternative approach. Flow updates can be interpreted as a variant of the celebrated algorithm by Angenent, Haker, and Tannenbaum, and can be combined canonically with domain decomposition into a hybrid scheme. We prove that this hybrid scheme converges to the global minimizer and provide a formal discussion of its continuity limit. We give a numerical comparison with naive and multiscale domain decomposition, and show that the flow updates prevent freezing in the regime of very many cells. While the multiscale scheme is observed to be faster than the hybrid approach in general, the latter could be a viable alternative in cases where a good initial coupling is available. Our numerical experiments are based on a novel GPU implementation of domain decomposition that we describe in the appendix.
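To make the naive domain decomposition iteration concrete, the following is a minimal NumPy sketch on a small one-dimensional problem. It is an illustration under stated assumptions, not the GPU implementation described in the appendix, and it does not include flow updates or the multiscale scheme. The function names (`sinkhorn_cell`, `entropic_score`, `domain_decomposition`), the squared-distance cost, the uniform marginals, the two staggered partitions, and all parameter values are hypothetical choices made for this example. Each cell problem is solved with a fixed number of standard Sinkhorn iterations.

```python
import numpy as np

def sinkhorn_cell(K, mu_cell, nu_cell, n_iter=200):
    """Entropic OT on one cell via Sinkhorn scaling (fixed iteration count)."""
    u = np.ones_like(mu_cell)
    for _ in range(n_iter):
        v = nu_cell / (K.T @ u)   # match the cell's Y-marginal
        u = mu_cell / (K @ v)     # match the cell's X-marginal (exact on exit)
    return u[:, None] * K * v[None, :]

def entropic_score(pi, C, mu, nu, eps):
    """Transport cost plus eps * KL(pi | mu x nu) for a coupling with unit mass."""
    ref = mu[:, None] * nu[None, :]
    return np.sum(C * pi) + eps * np.sum(pi * np.log(pi / ref))

def domain_decomposition(C, mu, nu, eps, partitions, n_outer=10):
    """Alternate over the given partitions of X, re-solving each cell problem."""
    ref = mu[:, None] * nu[None, :]
    K = np.exp(-C / eps) * ref    # Gibbs kernel weighted by the reference measure
    pi = ref.copy()               # assumed initial coupling: the product measure
    for it in range(n_outer):
        for cell in partitions[it % len(partitions)]:
            # Y-marginal of the current coupling restricted to this cell
            nu_cell = pi[cell].sum(axis=0)
            # Replace the coupling on the cell by the cell-wise entropic optimum
            pi[cell] = sinkhorn_cell(K[cell], mu[cell], nu_cell)
    return pi

# Toy problem (all values are assumptions chosen for illustration)
n = 16
x = np.linspace(0.0, 1.0, n)
C = (x[:, None] - x[None, :]) ** 2          # squared-distance cost
mu = np.full(n, 1.0 / n)
nu = np.full(n, 1.0 / n)
eps = 0.1

# Two staggered partitions of the index set of X into contiguous cells
part_a = [slice(0, 4), slice(4, 8), slice(8, 12), slice(12, 16)]
part_b = [slice(0, 2), slice(2, 6), slice(6, 10), slice(10, 14), slice(14, 16)]

score_init = entropic_score(mu[:, None] * nu[None, :], C, mu, nu, eps)
pi = domain_decomposition(C, mu, nu, eps, [part_a, part_b])
score_final = entropic_score(pi, C, mu, nu, eps)
print("initial score:", score_init)
print("final score:  ", score_final)
```

Because the cells within one partition are disjoint, the inner loop over cells could be executed in parallel, which is what makes the method attractive as a distributed solver. Information moves between neighbouring cells only when the partition is switched, which is the mechanism behind the slow propagation mentioned above when cells are very fine.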