Entropic optimal transport (EOT) is an effective and computationally viable alternative to unregularized optimal transport (OT), with diverse applications in large-scale data analysis. In this work, we derive novel statistical bounds for empirical plug-in estimators of the EOT cost and show that their statistical performance, as a function of the entropy regularization parameter $\epsilon$ and the sample size $n$, depends only on the simpler of the two probability measures. For instance, under sufficiently smooth costs this yields the parametric rate $n^{-1/2}$ with a factor of $\epsilon^{-d/2}$, where $d$ is the smaller of the dimensions of the two population measures. This confirms that empirical EOT also adheres to the lower complexity adaptation principle, a hallmark feature only recently identified for unregularized OT. As a consequence of our theory, we show that the empirical entropic Gromov-Wasserstein distance and its unregularized version for measures on Euclidean spaces also obey this principle. Additionally, we comment on computational aspects and complement our findings with Monte Carlo simulations. Our techniques employ empirical process theory and rely on a dual formulation of EOT over a single function class. Crucial to our analysis is the observation that the entropic cost-transformation of a function class increases its uniform metric entropy only slightly.
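To make the plug-in estimator and the single-function dual concrete, the following is a minimal sketch, not the implementation used in this work. It assumes the squared Euclidean cost, uniform weights on both samples, and the common convention in which the EOT cost penalizes the Kullback-Leibler divergence from the product measure with weight $\epsilon$. All function names (`entropic_transform`, `empirical_eot_cost`, and so on) are hypothetical. The sketch alternates entropic cost-transforms (Sinkhorn iterations in the log domain) and evaluates the dual at the pair $(f, f^{(c,\epsilon)})$, where it reduces to the sum of the two sample means.

```python
import math
import random


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def sq_dist(x, y):
    """Assumed cost: squared Euclidean distance c(x, y) = ||x - y||^2."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def entropic_transform(pot, cost_rows, eps):
    """Entropic cost-transform of a potential against uniform weights.

    cost_rows[j][i] is the cost between output point j and input point i.
    Returns, for each j, -eps * log(mean_i exp((pot[i] - cost_rows[j][i]) / eps)).
    """
    log_n = math.log(len(pot))
    return [
        -eps * (logsumexp([(p - c) / eps for p, c in zip(pot, row)]) - log_n)
        for row in cost_rows
    ]


def empirical_eot_cost(xs, ys, eps, n_iter=200):
    """Plug-in EOT cost estimate between two samples (lists of tuples)."""
    cost = [[sq_dist(x, y) for y in ys] for x in xs]  # cost[i][j]
    cost_t = [list(col) for col in zip(*cost)]        # cost_t[j][i]
    f = [0.0] * len(xs)
    for _ in range(n_iter):
        g = entropic_transform(f, cost_t, eps)  # g = entropic transform of f
        f = entropic_transform(g, cost, eps)    # f = entropic transform of g
    g = entropic_transform(f, cost_t, eps)
    # With g the entropic transform of f, the dual objective reduces to
    # mean(f) + mean(g); this is a lower bound that tightens as n_iter grows.
    return sum(f) / len(f) + sum(g) / len(g)


if __name__ == "__main__":
    rng = random.Random(0)
    # Illustrative setup: one sample on a segment embedded in R^3,
    # the other spread over the cube [0, 1]^3.
    xs = [(rng.random(), 0.0, 0.0) for _ in range(50)]
    ys = [(rng.random(), rng.random(), rng.random()) for _ in range(50)]
    print(empirical_eot_cost(xs, ys, eps=0.5))
```

Repeating the call over independently drawn samples of growing size gives a simple Monte Carlo experiment for inspecting how the estimate fluctuates with $n$ and $\epsilon$. The pure-Python loops are chosen for self-containment and scale quadratically in the sample size per iteration; a vectorized implementation would be preferable for large samples.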