Entropic optimal transport (EOT) presents an effective and computationally viable alternative to unregularized optimal transport (OT), offering diverse applications for large-scale data analysis. In this work, we derive novel statistical bounds for empirical plug-in estimators of the EOT cost and show that their statistical performance in the entropy regularization parameter $\epsilon$ and the sample size $n$ only depends on the simpler of the two probability measures. For instance, under sufficiently smooth costs this yields the parametric rate $n^{-1/2}$ with factor $\epsilon^{-d/2}$, where $d$ is the minimum dimension of the two population measures. This confirms that empirical EOT also adheres to the lower complexity adaptation principle, a hallmark feature only recently identified for unregularized OT. As a consequence of our theory, we show that the empirical entropic Gromov-Wasserstein distance and its unregularized version for measures on Euclidean spaces also obey this principle. Additionally, we comment on computational aspects and complement our findings with Monte Carlo simulations. Our techniques employ empirical process theory and rely on a dual formulation of EOT over a single function class. Crucial to our analysis is the observation that the entropic cost-transformation of a function class does not increase its uniform metric entropy by much.
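As a rough illustration of the empirical plug-in estimator discussed in the abstract, the following sketch computes the EOT cost between two samples using log-domain Sinkhorn iterations. The abstract does not specify a solver or a cost function, so the Sinkhorn scheme, the squared Euclidean cost, the fixed iteration count, and the function name `empirical_eot_cost` are all assumptions made for illustration only. Regularization here is the relative entropy with respect to the product of the two empirical measures.

```python
import numpy as np


def _logsumexp(z, axis):
    # Numerically stable log(sum(exp(z))) along one axis.
    m = np.max(z, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(z - m), axis=axis))


def empirical_eot_cost(x, y, eps, n_iter=500):
    """Plug-in EOT cost between the empirical measures of x (n, d) and y (m, d).

    Assumptions (not taken from the abstract): squared Euclidean cost,
    uniform sample weights, and a fixed number of Sinkhorn iterations.
    """
    n, m = x.shape[0], y.shape[0]
    # Pairwise squared Euclidean cost matrix of shape (n, m).
    cost = np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)
    log_a = np.full(n, -np.log(n))  # log of uniform weights on x
    log_b = np.full(m, -np.log(m))  # log of uniform weights on y
    f = np.zeros(n)  # dual potential on x
    g = np.zeros(m)  # dual potential on y
    for _ in range(n_iter):
        # Alternating entropic cost-transforms (block coordinate ascent on the dual).
        f = -eps * _logsumexp((g[None, :] - cost) / eps + log_b[None, :], axis=1)
        g = -eps * _logsumexp((f[:, None] - cost) / eps + log_a[:, None], axis=0)
    # After the g-update the coupling has total mass one, so the dual value
    # reduces to the sum of the averaged potentials.
    return float(np.mean(f) + np.mean(g))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical setup: one sample concentrated on a line in R^5,
    # the other spread over the full unit cube in R^5.
    t = rng.uniform(size=(200, 1))
    x = np.hstack([t, np.zeros((200, 4))])
    y = rng.uniform(size=(200, 5))
    print(empirical_eot_cost(x, y, eps=0.5))
```

Repeating such a computation over independent samples for several values of $n$ and $\epsilon$ is one way to probe the dependence on sample size and regularization empirically; this toy setup is not meant to reproduce the paper's Monte Carlo simulations.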