Optimal transport is a fundamental topic that has attracted a great deal of attention from the optimization community over the past decades. In this paper, we consider a discrete dynamic optimal transport problem: can we efficiently update the optimal transport plan when the weights or the locations of the data points change? This problem is naturally motivated by several applications in machine learning. For example, we often need to compute the optimal transport cost between two data sets; if a few data points change, should we recompute the cost from scratch, which is expensive, or update it with an efficient dynamic data structure? Several dynamic maximum flow algorithms have been proposed before; however, to the best of our knowledge, research on the dynamic minimum cost flow problem is still quite limited. We propose a novel 2D Skip Orthogonal List together with some dynamic tree techniques. Although our algorithm is based on the conventional simplex method, it can find the variable to pivot within expected $O(1)$ time and complete each pivoting operation within expected $O(|V|)$ time, where $V$ is the set of all supply and demand nodes. Since dynamic modifications typically do not introduce significant changes, our algorithm requires only a few simplex iterations in practice. Our algorithm is therefore more efficient than recomputing the optimal transport cost, which needs at least one traversal over all $|E| = O(|V|^2)$ variables, where $|E|$ denotes the number of edges in the network. Our experiments demonstrate that our algorithm significantly outperforms existing algorithms in dynamic scenarios.
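To make the step of finding the variable to pivot concrete, the following minimal sketch shows the naive baseline that the abstract contrasts against: after a cost changes, it recomputes the dual potentials on the current basis tree and scans all $|E|$ variables for a negative reduced cost. This is not the proposed 2D Skip Orthogonal List; the function names `potentials` and `entering_variable` and the toy 2-by-2 instance are assumptions made purely for illustration.

```python
from collections import deque


def potentials(cost, basis, n_supply, n_demand):
    """Solve u[i] + v[j] = cost[i][j] on the basis tree, fixing u[0] = 0."""
    u = [None] * n_supply
    v = [None] * n_demand
    u[0] = 0
    # Adjacency lists of the bipartite spanning tree formed by the basis.
    rows = [[] for _ in range(n_supply)]
    cols = [[] for _ in range(n_demand)]
    for i, j in basis:
        rows[i].append(j)
        cols[j].append(i)
    queue = deque([("supply", 0)])
    while queue:
        side, k = queue.popleft()
        if side == "supply":
            for j in rows[k]:
                if v[j] is None:
                    v[j] = cost[k][j] - u[k]
                    queue.append(("demand", j))
        else:
            for i in cols[k]:
                if u[i] is None:
                    u[i] = cost[i][k] - v[k]
                    queue.append(("supply", i))
    return u, v


def entering_variable(cost, basis, n_supply, n_demand):
    """Naive pricing: scan every variable for the most negative reduced cost.

    Returns (edge, reduced_cost), or (None, 0) if the basis is optimal.
    """
    u, v = potentials(cost, basis, n_supply, n_demand)
    best_edge, best = None, 0
    for i in range(n_supply):
        for j in range(n_demand):
            reduced = cost[i][j] - u[i] - v[j]
            if reduced < best:
                best_edge, best = (i, j), reduced
    return best_edge, best


# Toy instance (hypothetical): supplies [2, 1], demands [1, 2].
cost = [[1, 3], [4, 2]]
basis = [(0, 0), (0, 1), (1, 1)]  # spanning tree of the optimal plan

print(entering_variable(cost, basis, 2, 2))  # (None, 0): already optimal

# A dynamic modification: one data point moves, so one cost entry changes.
cost[0][0] = 6
print(entering_variable(cost, basis, 2, 2))  # ((1, 0), -1): one pivot needed
```

In this sketch a single modified cost entry triggers a full scan over all supply-demand pairs, even though only one variable becomes eligible to enter the basis. That scan is the part a dynamic data structure would aim to avoid.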