Learnable Compression Network with Transformer for Approximate Nearest Neighbor Search

Approximate nearest neighbor search (ANNS) plays a crucial role in information retrieval and has a wide range of applications. Consequently, many fast ANNS approaches have been proposed over the past several years. Among these, graph-based methods are some of the most popular, as they offer attractive theoretical guarantees and low query latency. In this paper, we propose a learnable compression network with transformer (LCNT), which projects feature vectors from a high-dimensional space onto a low-dimensional space while preserving neighbor relationships. The proposed model can be applied to existing graph-based methods to accelerate construction of the indexing graph and to further reduce query latency. Specifically, LCNT contains two major parts: a projection part and a harmonizing part. In the projection part, input vectors are projected into a sequence of subspaces by a multi-channel sparse projection network. In the harmonizing part, a modified Transformer network harmonizes the features in these subspaces and combines them into a new feature. To evaluate the effectiveness of the proposed model, we conduct experiments on two million-scale databases, GIST1M and Deep1M. Experimental results show that the proposed model speeds up construction of the indexing graph by a factor of 2 to 3 without significantly sacrificing accuracy, and reduces query latency by a factor of 1.3 to 2.0. In addition, the proposed model can be combined with other popular quantization methods.
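
The data flow of the two parts described above can be illustrated with a minimal NumPy sketch. It rests on several assumptions that are not taken from the paper: the weights are random and untrained, sparsity is imposed with a fixed random mask, a single-head self-attention layer with a residual connection stands in for the "modified Transformer" (whose details the abstract does not give), and the combination step is a plain linear map. All function names, dimensions, and the `density` parameter are hypothetical.

```python
import numpy as np


def make_sparse_projections(dim_in, n_channels, dim_sub, density, rng):
    """Build one masked (sparse) weight matrix per channel.

    Returns an array of shape (n_channels, dim_in, dim_sub). The random mask
    is an assumption; the paper's sparsity pattern may differ.
    """
    weights = rng.standard_normal((n_channels, dim_in, dim_sub))
    mask = rng.random((n_channels, dim_in, dim_sub)) < density
    return weights * mask / np.sqrt(density * dim_in)


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def harmonize(tokens, wq, wk, wv):
    """Single-head self-attention across channels, plus a residual connection.

    tokens has shape (n_vectors, n_channels, dim_sub); each channel's
    subspace feature is treated as one token in the sequence.
    """
    q, k, v = tokens @ wq, tokens @ wk, tokens @ wv
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])
    return tokens + softmax(scores) @ v


def compress(x, proj, wq, wk, wv, w_out):
    """Project to subspaces, harmonize them, and combine into one feature."""
    # Projection part: (N, D) -> (N, C, d_sub), one token per channel.
    tokens = np.einsum("nd,cds->ncs", x, proj)
    # Harmonizing part: let the subspace features attend to each other.
    mixed = harmonize(tokens, wq, wk, wv)
    # Combination (assumed): flatten the tokens and apply a linear map.
    return mixed.reshape(len(x), -1) @ w_out


rng = np.random.default_rng(0)
dim_in, n_channels, dim_sub, dim_out = 960, 8, 32, 128  # illustrative sizes

proj = make_sparse_projections(dim_in, n_channels, dim_sub, density=0.1, rng=rng)
wq, wk, wv = (
    rng.standard_normal((dim_sub, dim_sub)) / np.sqrt(dim_sub) for _ in range(3)
)
w_out = rng.standard_normal((n_channels * dim_sub, dim_out)) / np.sqrt(
    n_channels * dim_sub
)

x = rng.standard_normal((16, dim_in))  # stand-in for high-dimensional features
z = compress(x, proj, wq, wk, wv, w_out)
print(z.shape)  # (16, 128)
```

In a trained model, the compressed vectors `z` would replace the original vectors when a graph index is built and queried; the sketch only shows how shapes move through the two parts and makes no claim about neighbor preservation.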
