We introduce LOCORE, Long-Context Re-ranker, a model that takes as input local descriptors corresponding to an image query and a list of gallery images and outputs similarity scores between the query and each gallery image. This model is used for image retrieval, where typically a first ranking is performed with an efficient similarity measure, and then a shortlist of top-ranked images is re-ranked based on a more fine-grained similarity measure. Compared to existing methods that perform pair-wise similarity estimation with local descriptors or list-wise re-ranking with global descriptors, LOCORE is the first method to perform list-wise re-ranking with local descriptors. To achieve this, we leverage efficient long-context sequence models to effectively capture the dependencies between query and gallery images at the local-descriptor level. During testing, we process long shortlists with a sliding window strategy that is tailored to overcome the context size limitations of sequence models. Our approach achieves superior performance compared with other re-rankers on established image retrieval benchmarks of landmarks (ROxf and RPar), products (SOP), fashion items (In-Shop), and bird species (CUB-200) while having comparable latency to the pair-wise local descriptor re-rankers.
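The sliding-window strategy described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the scorer `score_window` (which stands in for a list-wise model that fits one window of gallery images in its context) and the max-over-windows aggregation are assumptions for the sake of the example.

```python
def rerank_sliding_window(score_window, query, shortlist, window=10, stride=5):
    """Re-rank a long shortlist with overlapping windows.

    score_window(query, items) must return one similarity score per item;
    it models a list-wise re-ranker whose context holds at most `window` images.
    Overlapping windows (stride < window) let every image be scored alongside
    several neighborhoods; we keep each image's best score across windows.
    """
    best = {}
    n = len(shortlist)
    for start in range(0, n, stride):
        items = shortlist[start:start + window]
        if not items:
            break
        scores = score_window(query, items)
        for item, s in zip(items, scores):
            best[item] = max(best.get(item, float("-inf")), s)
        if start + window >= n:  # last window already reached the end
            break
    # Final ranking: sort the shortlist by aggregated score, descending.
    return sorted(shortlist, key=lambda g: best[g], reverse=True)
```

With stride no larger than the window size, consecutive windows cover the shortlist contiguously, so every image receives at least one score before the final sort.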