We study the unbalanced optimal transport (UOT) problem in which the marginal constraints are enforced through Maximum Mean Discrepancy (MMD) regularization. Our work is motivated by the observation that the literature on UOT focuses on regularization based on $\phi$-divergences (e.g., the KL divergence). Despite the popularity of MMD, its role as a regularizer in the context of UOT appears to be less well understood. We begin by deriving the dual of MMD-regularized UOT (MMD-UOT), which helps us prove other useful properties. One interesting outcome of this duality result is that MMD-UOT induces novel metrics, which not only lift the ground metric like the Wasserstein distance but are also efficient to estimate like the MMD. Further, we present finite-dimensional convex programs for estimating MMD-UOT and the corresponding barycenter solely from samples of the measures being transported. Under mild conditions, we prove that our convex-program-based estimators are consistent and that the estimation error decays at a rate $\mathcal{O}\left(m^{-\frac{1}{2}}\right)$, where $m$ is the number of samples. As far as we know, such error bounds, which are free from the curse of dimensionality, are not known for $\phi$-divergence-regularized UOT. Finally, we discuss how the proposed convex programs can be solved efficiently using accelerated projected gradient descent. Our experiments show that MMD-UOT consistently outperforms popular baselines, including KL-regularized UOT and MMD, in diverse machine learning applications.
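To make the sample-based convex program concrete, the following is a minimal illustrative sketch, not the formulation or solver of the paper. It assumes a squared-MMD penalty on both marginals, a Gaussian kernel, a squared Euclidean ground cost, uniform sample weights, and plain (non-accelerated) projected gradient descent onto the non-negative orthant. All function names and parameters (`solve_mmd_uot`, `lam`, `bandwidth`) are hypothetical.

```python
import numpy as np


def sq_dists(x, y):
    """Pairwise squared Euclidean distances between rows of x and rows of y."""
    d = np.sum(x**2, axis=1)[:, None] + np.sum(y**2, axis=1)[None, :] - 2.0 * x @ y.T
    return np.maximum(d, 0.0)  # clip tiny negatives caused by round-off


def gaussian_gram(x, bandwidth):
    """Gram matrix of a Gaussian kernel (an assumed kernel choice)."""
    return np.exp(-sq_dists(x, x) / (2.0 * bandwidth**2))


def objective(plan, cost, gram_x, gram_y, a, b, lam):
    """Transport cost plus squared-MMD penalties on both marginal mismatches."""
    r = plan.sum(axis=1) - a  # row-marginal residual
    c = plan.sum(axis=0) - b  # column-marginal residual
    return np.sum(plan * cost) + lam * (r @ gram_x @ r + c @ gram_y @ c)


def solve_mmd_uot(x, y, lam=10.0, bandwidth=1.0, n_iter=200):
    """Projected gradient descent over non-negative plans (assumes lam > 0)."""
    m, n = len(x), len(y)
    a = np.full(m, 1.0 / m)  # uniform weights on the source samples
    b = np.full(n, 1.0 / n)  # uniform weights on the target samples
    cost = sq_dists(x, y)
    gram_x = gaussian_gram(x, bandwidth)
    gram_y = gaussian_gram(y, bandwidth)

    # Upper bound on the Lipschitz constant of the gradient of the objective.
    lipschitz = 2.0 * lam * (
        n * np.linalg.norm(gram_x, 2) + m * np.linalg.norm(gram_y, 2)
    )
    step = 1.0 / lipschitz

    plan = np.outer(a, b)  # feasible starting point with matching marginals
    history = [objective(plan, cost, gram_x, gram_y, a, b, lam)]
    for _ in range(n_iter):
        r = plan.sum(axis=1) - a
        c = plan.sum(axis=0) - b
        grad = cost + 2.0 * lam * ((gram_x @ r)[:, None] + (gram_y @ c)[None, :])
        plan = np.maximum(plan - step * grad, 0.0)  # project onto plan >= 0
        history.append(objective(plan, cost, gram_x, gram_y, a, b, lam))
    return plan, history
```

Calling `solve_mmd_uot(x, y)` on two sample arrays of shapes `(m, d)` and `(n, d)` returns an `m`-by-`n` non-negative plan and the objective value per iteration. Because the objective is convex with a Lipschitz gradient and the step size is the reciprocal of an upper bound on that constant, the recorded objective values should be non-increasing. An accelerated variant would add a momentum term to the same update.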