Lower Estimates for $L_1$-Distortion of Transportation Cost Spaces

Quantifying the dissimilarity between two probability distributions on a finite metric space $X$ is a fundamental task in computer science and computer vision. A natural dissimilarity measure based on optimal transport is the Earth Mover's Distance (EMD). A key technique for analyzing this metric, pioneered by Charikar (2002) and Indyk and Thaper (2003), involves constructing low-distortion embeddings of $\mathrm{EMD}(X)$ into the Lebesgue space $L_1$. It became a key problem to investigate whether the universal upper bound of $O(\log n)$ on this distortion, where $n$ is the number of points of $X$, can be improved for important classes of metric spaces known to admit low-distortion embeddings into $L_1$. In the context of computer vision, grid graphs, especially planar grids, are among the most fundamental. Indyk posed the related problem of estimating the $L_1$-distortion of the space of uniform distributions on $n$-point subsets of $\mathbb{R}^2$. The Progress Report, last updated in August 2011, highlighted two key results: first, the work of Khot and Naor (2006) on Hamming cubes, which showed that the $L_1$-distortion for Hamming cubes attains the upper estimate described above, and second, the result of Naor and Schechtman (2007) for planar grids, which established that the $L_1$-distortion for a planar $n$ by $n$ grid is $\Omega(\sqrt{\log n})$. Our first result improves the lower bound on the $L_1$-distortion for grids to $\Omega(\log n)$, matching the universal upper bound up to multiplicative constants. The key ingredient that allows us to obtain these sharp estimates is a new Sobolev-type inequality for scalar-valued functions on grid graphs. Our method is also applicable to many recursive families of graphs, such as diamond and Laakso graphs. We obtain sharp distortion estimates of order $\log n$ in these cases as well.
