We consider the $(1+\varepsilon)$-Approximate Nearest Neighbour (ANN) problem for polygonal curves in $d$-dimensional space under the Fr\'echet distance and ask to what extent known data structures for doubling spaces can be applied to this problem. At first glance, this approach does not seem viable, since the doubling dimension of the target space is known to be unbounded, even for well-behaved polygonal curves of constant complexity in one dimension. To overcome this, we identify a subspace of curves which has bounded doubling dimension and small Gromov-Hausdorff distance to the target space. We then apply state-of-the-art techniques for doubling spaces and show how to obtain a data structure for the $(1+\varepsilon)$-ANN problem for any set $S$ of $n$ parametrized polygonal curves. The expected preprocessing time needed to construct the data structure is $F(d,k,S,\varepsilon)n\log n$, the space used is $F(d,k,S,\varepsilon)n$, and the query time is $F(d,k,S,\varepsilon)\log n + F(d,k,S,\varepsilon)^{-\log(\varepsilon)}$, where $F(d,k,S,\varepsilon)=O\left(2^{O(d)}k\Phi(S)\varepsilon^{-1}\right)^k$ and $\Phi(S)$ denotes the spread of the set of vertices and edges of the curves in $S$. We extend these results to the realistic class of $c$-packed curves and show improved bounds for small values of $c$.
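To make the problem statement concrete, the following is a minimal sketch of the nearest-neighbour query that such a data structure is meant to speed up. It is not the construction described above: it swaps in the discrete Fréchet distance (computed on vertices only, which upper-bounds the continuous Fréchet distance) and answers queries by a plain linear scan. The function names `discrete_frechet` and `nearest_curve` are hypothetical and chosen only for this illustration.

```python
from math import dist


def discrete_frechet(p, q):
    """Discrete Frechet distance between two vertex sequences p and q.

    Assumption: this is the discrete variant, used here as a simple
    stand-in for the continuous Frechet distance.
    """
    n, m = len(p), len(q)
    # dp[i][j] = discrete Frechet distance between p[:i+1] and q[:j+1]
    dp = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = dist(p[i], q[j])
            if i == 0 and j == 0:
                dp[i][j] = d
            elif i == 0:
                dp[i][j] = max(dp[0][j - 1], d)
            elif j == 0:
                dp[i][j] = max(dp[i - 1][0], d)
            else:
                # Advance on p, on q, or on both; keep the cheapest coupling.
                best_prev = min(dp[i - 1][j], dp[i][j - 1], dp[i - 1][j - 1])
                dp[i][j] = max(best_prev, d)
    return dp[n - 1][m - 1]


def nearest_curve(curves, query):
    """Index of the exact nearest curve, found by a brute-force linear scan."""
    return min(range(len(curves)),
               key=lambda i: discrete_frechet(curves[i], query))


if __name__ == "__main__":
    curves = [
        [(0, 0), (1, 0), (2, 0)],
        [(0, 5), (1, 5), (2, 5)],
    ]
    query = [(0, 1), (1, 1), (2, 1)]
    print(nearest_curve(curves, query))
```

A scan of this kind costs one distance computation per stored curve on every query, which is the linear dependence on $n$ that the logarithmic query bound above avoids.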