Efficient spatial indexing is crucial for processing large-scale spatial data. Traditional spatial indexes, such as STR-Tree and Quad-Tree, organize spatial objects by coarse approximations, typically their minimum bounding rectangles (MBRs). For complex spatial objects (e.g., district boundaries and trajectories), an MBR covers much more area than the object itself, which limits the filtering accuracy and query performance of these indexes. To address these limitations, we propose GP-Tree, a fine-grained spatial index that organizes the grid-cell approximations of spatial objects in a prefix tree. GP-Tree improves filtering by replacing coarse MBRs with fine-grained, cell-based approximations of spatial objects. Because the encoding of a child cell in a hierarchical grid extends the encoding of its parent, the prefix tree stores each shared prefix only once, which improves both data organization and query efficiency. We further introduce two optimization strategies, tree pruning and node optimization, that shorten search paths and reduce memory consumption. Finally, we implement a variety of spatial query operations on GP-Tree, including range queries, distance queries, and k-nearest neighbor queries. Extensive experiments on real-world datasets demonstrate that GP-Tree significantly outperforms traditional spatial indexes, achieving up to an order-of-magnitude improvement in query efficiency.
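To make the prefix-tree idea concrete, the following is a minimal sketch and not the GP-Tree implementation itself. It assumes a quadtree-style encoding in which each cell is a string of quadrant digits ("0" to "3"), so a child cell's code extends its parent's code by one digit. The names `Node`, `CellPrefixTree`, `insert`, and `candidates` are hypothetical, and the pruning and node optimizations mentioned in the abstract are omitted.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # One trie node per grid cell; children are keyed by the next quadrant digit.
    children: dict = field(default_factory=dict)
    objects: set = field(default_factory=set)


class CellPrefixTree:
    """Toy prefix tree over hierarchical cell codes (assumed encoding: digits 0-3)."""

    def __init__(self):
        self.root = Node()

    def insert(self, obj_id, cell_codes):
        # Register the object at the node of every cell in its approximation.
        for code in cell_codes:
            node = self.root
            for digit in code:
                node = node.children.setdefault(digit, Node())
            node.objects.add(obj_id)

    def candidates(self, query_code):
        # Return objects having a cell that contains, equals, or lies inside
        # the query cell, i.e. one code is a prefix of the other.
        result = set()
        node = self.root
        for digit in query_code:
            result |= node.objects  # ancestor cells cover the query cell
            node = node.children.get(digit)
            if node is None:
                return result
        stack = [node]  # the query cell itself plus all of its descendants
        while stack:
            current = stack.pop()
            result |= current.objects
            stack.extend(current.children.values())
        return result


tree = CellPrefixTree()
tree.insert("district", ["00", "01", "1"])
tree.insert("trajectory", ["032", "033"])
tree.insert("park", ["31"])
print(sorted(tree.candidates("0")))
print(sorted(tree.candidates("10")))
```

In this sketch, a query cell matches an object only when the two cells overlap in the grid hierarchy, so objects whose MBRs would overlap the query but whose cells do not are never returned as candidates. A range query over an arbitrary region would, under the same assumptions, first approximate the region by a set of cells and take the union of `candidates` over them.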