The Nearest Neighbor Search (NNS) problem asks for a data structure that preprocesses an $n$-point dataset $X$ in a metric space $\mathcal{M}$ so that, given a query point $q \in \mathcal{M}$, one can quickly return a point of $X$ minimizing the distance to $q$. The efficiency of such a data structure is measured primarily by the space it uses and the time it takes to answer a query. We focus on the fast query-time regime, often modeled as query time $\text{poly}(d \log n)$ with $d$ the dimension, which is crucial for modern large-scale applications where datasets are massive and queries must be processed online. Our main result is a randomized data structure for NNS in $\ell_p$ spaces, $p>2$, that achieves a $p^{O(1) + \log\log p}$ approximation with fast query time and $\text{poly}(dn)$ space. It improves upon, or is incomparable to, the state of the art for the fast query-time regime from [Bartal and Gottlieb, TCS 2019] and [Krauthgamer, Petruschka and Sapir, FOCS 2025].
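
To make the problem statement concrete, the following is a minimal sketch of the trivial linear-scan baseline for NNS in $\ell_p$, together with a check of what a $c$-approximate answer means. It is not the data structure of this work: it uses no preprocessing and takes time proportional to $nd$ per query, which is exactly the cost the fast query-time regime aims to avoid. The function names (`lp_distance`, `nearest_neighbor`, `is_c_approximate`) are hypothetical and chosen only for illustration.

```python
from typing import Sequence

Point = Sequence[float]


def lp_distance(x: Point, y: Point, p: float) -> float:
    """l_p distance between two points of equal dimension, for finite p >= 1."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def nearest_neighbor(X: Sequence[Point], q: Point, p: float) -> Point:
    """Exact NNS by linear scan over all n points (no preprocessing)."""
    return min(X, key=lambda x: lp_distance(x, q, p))


def is_c_approximate(X: Sequence[Point], q: Point, candidate: Point,
                     p: float, c: float) -> bool:
    """True if candidate is within factor c of the optimal distance to q."""
    best = lp_distance(nearest_neighbor(X, q, p), q, p)
    return lp_distance(candidate, q, p) <= c * best


if __name__ == "__main__":
    X = [(0.0, 0.0), (3.0, 4.0), (1.0, 1.0)]
    q = (1.0, 2.0)
    print(nearest_neighbor(X, q, p=3))
    print(is_c_approximate(X, q, (0.0, 0.0), p=3, c=3.0))
```

An approximate data structure is allowed to return any point for which `is_c_approximate` would hold, with $c = p^{O(1) + \log\log p}$ in the result stated above.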