Embedding Probability Distributions into Low Dimensional $\ell_1$: Tree Ising Models via Truncated Metrics

Given an arbitrary set of high dimensional points in $\ell_1$, known negative results rule out always mapping them to a low dimensional $\ell_1$ space while preserving distances with small multiplicative distortion. This is in stark contrast with dimension reduction in Euclidean space ($\ell_2$), where such mappings are always possible. While the first non-trivial lower bounds for $\ell_1$ dimension reduction were established almost 20 years ago, there has been limited progress in understanding which sets of points in $\ell_1$ are conducive to a low-dimensional mapping. In this work, we study a new characterization of $\ell_1$ metrics that are conducive to dimension reduction in $\ell_1$. Our characterization focuses on metrics that are defined by the disagreement of binary variables over a probability distribution -- any $\ell_1$ metric can be represented in this form. We show that, for configurations of $n$ points in $\ell_1$ obtained from tree Ising models, we can reduce dimension to $\mathrm{polylog}(n)$ with constant distortion. In doing so, we develop technical tools for embedding truncated metrics, which have been studied because of their applications in computer vision and are objects of independent interest in metric geometry. Among other tools, we show how any $\ell_1$ metric can be truncated with $O(1)$ distortion and $O(\log n)$ blowup in dimension.
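As a minimal sketch of the objects named in the abstract, the Python below builds a toy three-variable chain and computes a disagreement metric on it. It rests on assumptions that the abstract does not spell out: that the disagreement metric is $d(i,j) = \Pr[X_i \neq X_j]$, and that truncating a metric at a threshold $\tau$ means taking $\min(d, \tau)$. The chain, its flip probabilities, and all function names are hypothetical and chosen only for illustration. The chain is meant in the spirit of a path-shaped tree Ising model with no external field.

```python
from itertools import product

def chain_distribution(flip_probs):
    """Joint law of a path X_0 - X_1 - ... with a uniform first bit.
    Edge k flips the bit between X_k and X_{k+1} with probability flip_probs[k].
    (Toy model assumed for illustration, not taken from the paper.)"""
    n = len(flip_probs) + 1
    dist = {}
    for bits in product((0, 1), repeat=n):
        p = 0.5  # uniform first bit
        for k, q in enumerate(flip_probs):
            p *= q if bits[k] != bits[k + 1] else 1.0 - q
        dist[bits] = p
    return dist

def disagreement_metric(dist, n):
    """Assumed definition: d[i][j] = Pr[X_i != X_j] under dist."""
    d = [[0.0] * n for _ in range(n)]
    for bits, p in dist.items():
        for i in range(n):
            for j in range(n):
                if bits[i] != bits[j]:
                    d[i][j] += p
    return d

def l1_points(dist, n):
    """One coordinate per outcome w: point i gets p(w) * w[i].
    The l1 distance between points i and j then equals Pr[X_i != X_j]."""
    outcomes = sorted(dist)
    return [[dist[w] * w[i] for w in outcomes] for i in range(n)]

def l1_distance(u, v):
    return sum(abs(a - b) for a, b in zip(u, v))

def truncate(d, tau):
    """Assumed definition of truncation: entrywise min(d, tau)."""
    return [[min(x, tau) for x in row] for row in d]

dist = chain_distribution([0.1, 0.2])
d = disagreement_metric(dist, 3)
points = l1_points(dist, 3)
d_trunc = truncate(d, 0.15)
```

Note that `l1_points` uses one coordinate per outcome, so its dimension grows as $2^n$. This is the high dimensional starting representation, not a dimension reduction, and the sketch makes no attempt to reproduce the paper's embedding.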

