Entropic optimal transport (EOT) presents an effective and computationally viable alternative to unregularized optimal transport (OT), offering diverse applications for large-scale data analysis. In this work, we derive novel statistical bounds for empirical plug-in estimators of the EOT cost and show that their statistical performance, in terms of both the entropy regularization parameter $\epsilon$ and the sample size $n$, depends only on the simpler of the two probability measures. For instance, under sufficiently smooth costs this yields the parametric rate $n^{-1/2}$ with factor $\epsilon^{-d/2}$, where $d$ is the minimum dimension of the two population measures. This confirms that empirical EOT also adheres to the lower complexity adaptation principle, a hallmark feature only recently identified for unregularized OT. As a consequence of our theory, we show that the empirical entropic Gromov-Wasserstein distance and its unregularized version for measures on Euclidean spaces also obey this principle. Additionally, we comment on computational aspects and complement our findings with Monte Carlo simulations. Our techniques employ empirical process theory and rely on a dual formulation of EOT over a single function class. Crucial to our analysis is the observation that the entropic cost-transformation of a function class does not increase its uniform metric entropy by much.
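To fix ideas, the empirical plug-in estimator replaces the population measures by the empirical measures of the samples and evaluates the EOT cost between them, which is routinely done with Sinkhorn iterations on the dual potentials. The following is a minimal illustrative sketch (not the paper's implementation), assuming uniform empirical weights and the squared Euclidean cost; the function name and the log-domain formulation are our own choices.

```python
import numpy as np

def _logsumexp(M, axis):
    """Numerically stable log-sum-exp along the given axis."""
    mmax = M.max(axis=axis, keepdims=True)
    return (mmax + np.log(np.exp(M - mmax).sum(axis=axis, keepdims=True))).squeeze(axis)

def entropic_ot_cost(x, y, eps=0.1, n_iter=500):
    """Plug-in estimator of the EOT cost between the empirical measures of
    samples x (n, d) and y (m, d) for squared Euclidean cost, computed via
    log-domain Sinkhorn iterations (illustrative sketch only)."""
    n, m = len(x), len(y)
    C = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)  # cost matrix C[i, j]
    log_a = np.full(n, -np.log(n))  # uniform empirical weights (in log space)
    log_b = np.full(m, -np.log(m))
    f, g = np.zeros(n), np.zeros(m)  # dual potentials
    for _ in range(n_iter):
        # Alternating dual updates; each step is an (eps-scaled) soft minimum
        f = -eps * _logsumexp((g[None, :] - C) / eps + log_b[None, :], axis=1)
        g = -eps * _logsumexp((f[:, None] - C) / eps + log_a[:, None], axis=0)
    # At convergence, the EOT cost equals the dual objective f·a + g·b
    return f @ np.exp(log_a) + g @ np.exp(log_b)
```

For two single-point measures the coupling is forced, so the estimator returns exactly the cost between the points; for genuine samples, the statistical error of this plug-in quantity is what the bounds above control.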