For approximate nearest neighbor search (ANNS), graph-based algorithms have been shown to offer the best trade-off between accuracy and search time. We propose the Dynamic Exploration Graph (DEG), which significantly outperforms existing algorithms in search and exploration efficiency by combining two new ideas. First, a single undirected even regular graph is built incrementally by partially replacing existing edges, which integrates new vertices and updates old neighborhoods at the same time. Second, an edge optimization algorithm continuously improves the quality of the graph. Combining this ongoing refinement with the graph construction process keeps the graph well organized at all times, resulting in (1) increased search efficiency, (2) predictable index size, (3) guaranteed connectivity and therefore reachability of all vertices, and (4) a dynamic graph structure. In addition, we investigate how well existing graph-based search systems handle indexed queries, where the seed vertex of a search is the query itself. Such exploration tasks, despite their good starting point, are not necessarily easy: high efficiency in ANNS does not automatically imply good performance in exploratory search. Extensive experiments show that DEG significantly outperforms existing algorithms for both indexed and unindexed queries.
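To illustrate why replacing edges can keep a graph regular and connected while it grows, the following is a minimal sketch, not the DEG construction itself. It assumes integer vertex ids and picks the replaced edges at random, whereas the abstract does not say how DEG chooses them; the names `insert_vertex`, `build_graph`, and `is_connected` are hypothetical. Each insertion removes d/2 vertex-disjoint edges (a, b) and adds (v, a) and (v, b), so every old vertex keeps degree d, the new vertex gets degree d, and each removed edge is replaced by the path a-v-b.

```python
import random


def insert_vertex(adj, v, d, rng):
    """Add vertex v to a d-regular undirected graph (d even) by edge replacement.

    adj maps each vertex to the set of its neighbors. Edge selection here is
    random (an assumption for illustration only).
    """
    assert d % 2 == 0, "degree must be even"
    edges = [(a, b) for a in adj for b in adj[a] if a < b]
    rng.shuffle(edges)

    # Greedily pick d/2 edges that share no endpoint, so v gets d distinct neighbors.
    used, chosen = set(), []
    for a, b in edges:
        if a not in used and b not in used:
            chosen.append((a, b))
            used.update((a, b))
            if len(chosen) == d // 2:
                break
    if len(chosen) < d // 2:
        raise ValueError("not enough vertex-disjoint edges")

    adj[v] = set()
    for a, b in chosen:
        # Replace edge (a, b) with the two edges (a, v) and (v, b).
        adj[a].remove(b)
        adj[b].remove(a)
        adj[a].add(v)
        adj[b].add(v)
        adj[v].update((a, b))


def build_graph(n, d, seed=0):
    """Grow a d-regular graph on n vertices, starting from the complete graph K_{d+1}."""
    rng = random.Random(seed)
    adj = {i: set(range(d + 1)) - {i} for i in range(d + 1)}
    for v in range(d + 1, n):
        insert_vertex(adj, v, d, rng)
    return adj


def is_connected(adj):
    """Check that every vertex is reachable from an arbitrary start vertex."""
    start = next(iter(adj))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in adj[u] - seen:
            seen.add(w)
            stack.append(w)
    return len(seen) == len(adj)
```

Under these assumptions, `build_graph(20, 4)` yields a graph in which every vertex has exactly four neighbors, so the number of edges (and thus the index size) is fixed by n and d alone. A real index would choose the replaced edges using distances between vertices and would follow up with edge optimization, neither of which this sketch attempts.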