In this paper, we investigate a novel spatial dataset search paradigm over multiple spatial data sources that lets users conduct join and union searches seamlessly. Specifically, we define two search problems: the Maximum Intersection Query (MIQ) and the Maximum Coverage Query with a Connection constraint (MCQC). To address these problems, we propose a unified Multi-source Spatial Dataset Search (MSDS) framework. In MSDS, we design a multi-layer index to accelerate both MIQ and MCQC. In addition, we prove that MCQC is NP-hard and design two greedy algorithms to solve it. To handle the continual updates of spatial datasets in each data source, we design a dynamic index updating strategy and optimize the search algorithms to reduce communication costs and improve search efficiency. We evaluate MSDS on five real-world data sources, and the experimental results show that our framework significantly reduces both running time and communication cost.
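To make the two query types concrete, the following is a minimal sketch under stated assumptions, not the algorithms proposed in the paper. The abstract does not give formal definitions, so the sketch assumes that each dataset is rasterized into a set of grid cells, that two datasets are "connected" when they share at least one cell, and that MCQC selects at most k datasets including the query dataset. The names `max_intersection` and `greedy_connected_coverage` are hypothetical.

```python
from typing import Dict, List, Optional, Set, Tuple

Cell = Tuple[int, int]  # assumed representation: a dataset is a set of grid cells


def max_intersection(query: Set[Cell], datasets: Dict[str, Set[Cell]]) -> Optional[str]:
    """MIQ-style search (assumed form): id of the dataset sharing the most cells with the query."""
    best_id, best_overlap = None, 0
    for ds_id in sorted(datasets):  # sorted ids give deterministic tie-breaking
        overlap = len(query & datasets[ds_id])
        if overlap > best_overlap:
            best_id, best_overlap = ds_id, overlap
    return best_id


def connected(a: Set[Cell], b: Set[Cell]) -> bool:
    """Assumed connection rule: two datasets are connected if they share a cell."""
    return not a.isdisjoint(b)


def greedy_connected_coverage(
    query_id: str, datasets: Dict[str, Set[Cell]], k: int
) -> Tuple[List[str], Set[Cell]]:
    """MCQC-style greedy heuristic (assumed form).

    Starting from the query dataset, repeatedly add the dataset that is
    connected to the current selection and covers the most new cells,
    until k datasets are chosen or no candidate adds coverage.
    """
    chosen = [query_id]
    covered = set(datasets[query_id])
    while len(chosen) < k:
        best_id, best_gain = None, 0
        for ds_id in sorted(datasets):
            if ds_id in chosen:
                continue
            cells = datasets[ds_id]
            # Connection constraint: must touch at least one selected dataset.
            if not any(connected(cells, datasets[c]) for c in chosen):
                continue
            gain = len(cells - covered)  # marginal coverage of this candidate
            if gain > best_gain:
                best_id, best_gain = ds_id, gain
        if best_id is None:  # no connected dataset adds new cells
            break
        chosen.append(best_id)
        covered |= datasets[best_id]
    return chosen, covered
```

In this sketch, a large dataset that is disconnected from the current selection is never picked, even when it would add the most coverage. That is the sense in which the connection constraint separates this problem from plain maximum coverage.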