Near-Optimal Dimension Reduction for Facility Location

Oblivious dimension reduction, à la the Johnson-Lindenstrauss (JL) Lemma, is a fundamental approach for processing high-dimensional data. We study this approach for Uniform Facility Location (UFL) on a Euclidean input $X\subset\mathbb{R}^d$, where facilities can lie anywhere in the ambient space (not restricted to $X$). Our main result is that target dimension $m=\tilde{O}(\epsilon^{-2}\mathrm{ddim})$ suffices to $(1+\epsilon)$-approximate the optimal value of UFL on inputs whose doubling dimension is bounded by $\mathrm{ddim}$. This significantly improves over previous results, which could only achieve an $O(1)$-approximation [Narayanan, Silwal, Indyk, and Zamir, ICML 2021] or dimension $m=O(\epsilon^{-2}\log n)$ for $n=|X|$, which follows from [Makarychev, Makarychev, and Razenshteyn, STOC 2019]. Our oblivious dimension reduction has immediate implications for streaming and offline algorithms, obtained by employing known algorithms for low dimension. In dynamic geometric streams, it implies a $(1+\epsilon)$-approximation algorithm that uses $O(\epsilon^{-1}\log n)^{\tilde{O}(\mathrm{ddim}/\epsilon^{2})}$ bits of space, which is the first streaming algorithm for UFL to utilize the doubling dimension. In the offline setting, it implies a $(1+\epsilon)$-approximation algorithm, which we further refine to run in time $( (1/\epsilon)^{\tilde{O}(\mathrm{ddim})} d + 2^{(1/\epsilon)^{\tilde{O}(\mathrm{ddim})}}) \cdot \tilde{O}(n)$. Prior work has a similar running time but requires some restriction on the facilities [Cohen-Addad, Feldmann and Saulpic, JACM 2021]. Our main technical contribution is a fast procedure to decompose an input $X$ into several $k$-median instances for small $k$. This decomposition is inspired by, but differs significantly from, that of [Czumaj, Lammersen, Monemizadeh and Sohler, SODA 2013], and is key to both our dimension reduction and our PTAS.
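To make the setting concrete, here is a minimal Python sketch of an oblivious (data-independent) Gaussian projection together with the UFL objective for a fixed set of facilities. It is not the paper's procedure: the helper names `jl_matrix`, `project`, and `ufl_cost`, the toy data, and the choice `m = 20` are assumptions made for illustration. It also compares the cost of one fixed solution before and after projection, whereas the result above concerns the optimal value.

```python
import math
import random


def jl_matrix(d, m, seed=0):
    """Draw an m x d Gaussian matrix scaled by 1/sqrt(m).

    The matrix is drawn without looking at the data, which is what
    "oblivious" means here.
    """
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(m)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(m)]


def project(G, x):
    """Apply the linear map G to a point x in R^d, giving a point in R^m."""
    return [sum(g * xi for g, xi in zip(row, x)) for row in G]


def ufl_cost(points, facilities, opening_cost):
    """UFL objective for a fixed facility set.

    Every facility costs `opening_cost` to open, and every point pays its
    Euclidean distance to the nearest open facility.
    """
    connection = sum(min(math.dist(p, f) for f in facilities) for p in points)
    return opening_cost * len(facilities) + connection


if __name__ == "__main__":
    d, m = 200, 20  # ambient and target dimension (illustrative values)
    rng = random.Random(1)

    # Toy input: two noisy clusters in R^d, with one facility per cluster center.
    centers = [[0.0] * d, [5.0] * d]
    points = [[c + rng.gauss(0.0, 0.1) for c in center]
              for center in centers for _ in range(50)]

    G = jl_matrix(d, m)
    low_points = [project(G, p) for p in points]
    low_centers = [project(G, c) for c in centers]

    print("cost in R^d:", ufl_cost(points, centers, opening_cost=1.0))
    print("cost in R^m:", ufl_cost(low_points, low_centers, opening_cost=1.0))
```

Because the projection is linear and drawn independently of $X$, the same matrix can be applied to points as they arrive, which is what makes this style of reduction usable in streaming settings.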

