Spatial query and analysis results often feed directly into decision-making processes such as facility location, proximity resource discovery, accessibility analysis, and risk assessment, so the efficiency of the underlying spatial data access directly affects the response speed of spatial decision analysis. Existing distributed spatial analysis systems (e.g., Simba, Sedona) provide relatively mature execution frameworks, but they incur substantial overhead in local index construction and query refinement, especially in read-intensive scenarios. Recent studies have shown that learned indices offer considerable retrieval potential in single-machine settings, yet how to integrate them into distributed spatial analysis systems at low modification cost remains unaddressed. In this article, we present LiLIS, a Lightweight distributed Learned Index prototype for Spatial decision analysis. Without modifying existing execution engines, LiLIS integrates machine-learned search strategies with spatial-aware partitioning in a distributed framework and efficiently supports common spatial queries, including point queries, range queries, $k$-nearest neighbor ($k$NN) queries, and spatial joins. Extensive experiments on both real-world and synthetic datasets demonstrate that LiLIS achieves lower latency across various query types and reduces index construction overhead compared with baseline approaches. These results indicate its potential for improving the responsiveness of read-intensive spatial decision-support workflows.
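To make the combination of a learned search strategy with spatial-aware partitioning more concrete, the following is a minimal single-process sketch. It is not the LiLIS implementation, whose internals the abstract does not describe. The names `LearnedIndex1D` and `PartitionedIndex`, the choice of equal-count vertical strips as partitions, and the single linear model per partition are all assumptions made for illustration. Each partition fits a linear model from a point's y-coordinate to its position in a sorted list, records the worst prediction error, and then runs a binary search only inside that error window. A range query prunes partitions by their x-extent, uses the model to find candidates by y, and refines by x.

```python
import math
from bisect import bisect_left, bisect_right


class LearnedIndex1D:
    """Hypothetical helper: linear model from key to position in a sorted list."""

    def __init__(self, keys):
        if not keys:
            raise ValueError("keys must be non-empty")
        self.keys = sorted(keys)
        n = len(self.keys)
        # Least-squares fit: position ~ slope * key + intercept.
        mean_k = sum(self.keys) / n
        mean_p = (n - 1) / 2
        var = sum((k - mean_k) ** 2 for k in self.keys)
        cov = sum((k - mean_k) * (i - mean_p) for i, k in enumerate(self.keys))
        self.slope = max(0.0, cov / var) if var > 0 else 0.0
        self.intercept = mean_p - self.slope * mean_k
        # Worst-case prediction error, padded by one slot for float rounding.
        max_err = max(abs(self._predict(k) - i) for i, k in enumerate(self.keys))
        self.err = math.ceil(max_err) + 1

    def _predict(self, key):
        return self.slope * key + self.intercept

    def _window(self, key):
        # Search window guaranteed to contain the true bound, because the
        # model is monotone and its error on stored keys is at most self.err.
        n = len(self.keys)
        p = self._predict(key)
        lo = min(n, max(0, math.floor(p) - self.err))
        hi = max(lo, min(n, math.ceil(p) + self.err + 1))
        return lo, hi

    def lower_bound(self, key):
        """Index of the first stored key >= key."""
        lo, hi = self._window(key)
        return bisect_left(self.keys, key, lo, hi)

    def upper_bound(self, key):
        """Index of the first stored key > key."""
        lo, hi = self._window(key)
        return bisect_right(self.keys, key, lo, hi)


class PartitionedIndex:
    """Hypothetical helper: x-strips, each with a learned index over y."""

    def __init__(self, points, num_partitions=4):
        if not points:
            raise ValueError("points must be non-empty")
        pts = sorted(points)  # order by x so each strip covers an x-interval
        size = math.ceil(len(pts) / num_partitions)
        self.parts = []
        for start in range(0, len(pts), size):
            chunk = sorted(pts[start:start + size], key=lambda p: p[1])
            xs = [p[0] for p in chunk]
            index = LearnedIndex1D([p[1] for p in chunk])
            self.parts.append((min(xs), max(xs), chunk, index))

    def range_query(self, xmin, ymin, xmax, ymax):
        """Return all points inside the closed box [xmin, xmax] x [ymin, ymax]."""
        result = []
        for part_xmin, part_xmax, chunk, index in self.parts:
            if part_xmax < xmin or part_xmin > xmax:
                continue  # partition pruning: strip cannot intersect the box
            i = index.lower_bound(ymin)
            j = index.upper_bound(ymax)
            # Refinement: candidates match on y; filter on x.
            result.extend(p for p in chunk[i:j] if xmin <= p[0] <= xmax)
        return result


if __name__ == "__main__":
    demo_points = [(float(i % 10), float(i // 10)) for i in range(100)]
    demo_index = PartitionedIndex(demo_points, num_partitions=4)
    print(len(demo_index.range_query(2.0, 3.0, 4.0, 5.0)))
```

In a distributed setting, each partition would live on a worker and the pruning step would decide which workers receive the query. The sketch keeps everything in one process to show only the per-partition search logic.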