We present a novel search optimization solution for approximate nearest neighbor (ANN) search on resource-constrained edge devices. Traditional ANN approaches fall short of the specific demands of real-world scenarios, e.g., skewed query likelihood distributions and search over large-scale indices with low latency and a small footprint. To address these limitations, we introduce two key components: a Query Likelihood Boosted Tree (QLBT), which optimizes average search latency for frequently used small datasets, and a two-level approximate search algorithm, which enables efficient retrieval from large datasets on edge devices. We perform a thorough evaluation on simulated and real data and demonstrate that QLBT reduces latency by 15% on real data and that our two-level search algorithm achieves deployable accuracy and latency on a 10-million-item dataset for edge devices. In addition, we provide a comprehensive protocol for configuring and optimizing on-device search algorithms, derived from extensive empirical studies.
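To illustrate why a skewed query likelihood distribution matters for average latency, the following minimal sketch compares the expected leaf depth of a balanced binary tree with that of a likelihood-weighted tree. It uses Huffman-style merging as a stand-in and is not the QLBT construction itself, which the abstract does not specify. The function names `huffman_depths` and `expected_depth` and the likelihood values are hypothetical, and leaf depth is assumed here to be a rough proxy for search latency.

```python
import heapq
import itertools
import math


def huffman_depths(likelihoods):
    """Leaf depth of each item in a Huffman-style binary tree.

    Illustrative stand-in only: items with higher query likelihood
    end up closer to the root.
    """
    counter = itertools.count()  # tie-breaker so the heap never compares lists
    heap = [(p, next(counter), [i]) for i, p in enumerate(likelihoods)]
    heapq.heapify(heap)
    depths = [0] * len(likelihoods)
    while len(heap) > 1:
        # Merge the two least likely subtrees under a new parent node.
        p1, _, items1 = heapq.heappop(heap)
        p2, _, items2 = heapq.heappop(heap)
        merged = items1 + items2
        for i in merged:
            depths[i] += 1  # every item in the merged subtree moves one level down
        heapq.heappush(heap, (p1 + p2, next(counter), merged))
    return depths


def expected_depth(likelihoods, depths):
    """Average depth visited per query, weighted by query likelihood."""
    return sum(p * d for p, d in zip(likelihoods, depths))


# Hypothetical skewed query distribution over four items.
likelihoods = [0.5, 0.25, 0.125, 0.125]

# Balanced tree: every item sits at depth ceil(log2(n)).
balanced = [math.ceil(math.log2(len(likelihoods)))] * len(likelihoods)
weighted = huffman_depths(likelihoods)

balanced_cost = expected_depth(likelihoods, balanced)
weighted_cost = expected_depth(likelihoods, weighted)
print(balanced_cost, weighted_cost)
```

Under this toy distribution the likelihood-weighted tree has a lower expected depth than the balanced one, because the most frequently queried item sits one level below the root. A practical ANN tree would also have to preserve spatial locality when splitting, which this sketch ignores.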