Image-Text Retrieval (ITR) is essentially a ranking problem: given a query caption, the goal is to rank candidate images in descending order of relevance. Current ITR datasets are constructed in a pairwise manner, with each image-text pair annotated as either positive or negative. Correspondingly, ITR models mainly learn to rank with pairwise losses such as the triplet loss. Pairwise-based ITR increases the similarity of positive pairs while decreasing the similarity of all negative pairs indiscriminately. However, different negative pairs differ in relevance, and pairwise annotations cannot reflect this difference. Pairwise annotations in current datasets also miss many correlations: many pairs labeled as negative are in fact potential positives. As a result, pairwise-based ITR can only rank positive samples ahead of negative samples; it cannot rank the negative samples themselves by relevance. In this paper, we integrate listwise ranking, which optimizes the entire ranking list based on relevance scores, into conventional pairwise-based ITR. Specifically, we first propose a Relevance Score Calculation (RSC) module to calculate relevance scores for the entire ranked list. We then choose the ranking metric Normalized Discounted Cumulative Gain (NDCG) as the optimization objective and transform the non-differentiable NDCG into a differentiable listwise loss, named Smooth-NDCG (S-NDCG). Our listwise ranking approach can be integrated into current pairwise-based ITR models in a plug-and-play manner. Experiments on ITR benchmarks show that integrating listwise ranking improves the performance of current ITR models and provides more user-friendly retrieval results. The code is available at https://github.com/AAA-Zheng/Listwise_ITR.
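To make the idea of turning NDCG into a differentiable listwise objective more concrete, the following is a minimal sketch of one common smoothing strategy: replacing each item's hard rank with a sigmoid-based soft rank. It is an illustration under stated assumptions, not necessarily the S-NDCG formulation used in the paper. The function names (`dcg`, `ndcg`, `smooth_ndcg`), the temperature `tau`, and the exponential gain `2^rel - 1` are all hypothetical choices, and the relevance scores are taken as given inputs rather than produced by the RSC module. NumPy only shows the forward computation; training would require an automatic differentiation framework.

```python
import numpy as np


def dcg(relevance_in_rank_order):
    """Discounted cumulative gain of relevance scores listed in ranked order."""
    rel = np.asarray(relevance_in_rank_order, dtype=float)
    positions = np.arange(1, rel.size + 1)
    # Exponential gain and log2 position discount (a common convention, assumed here).
    return float(np.sum((2.0 ** rel - 1.0) / np.log2(positions + 1)))


def ndcg(similarity, relevance):
    """Exact NDCG: sort candidates by model similarity (a non-differentiable step)."""
    sim = np.asarray(similarity, dtype=float)
    rel = np.asarray(relevance, dtype=float)
    order = np.argsort(-sim)                 # hard ranking by descending similarity
    ideal_dcg = dcg(np.sort(rel)[::-1])      # best achievable DCG for these scores
    return dcg(rel[order]) / ideal_dcg if ideal_dcg > 0 else 0.0


def smooth_ndcg(similarity, relevance, tau=0.01):
    """Smoothed NDCG using soft ranks (assumed sigmoid relaxation).

    soft_rank_i = 1 + sum_{j != i} sigmoid((s_j - s_i) / tau)
    As tau approaches 0, soft ranks approach the hard ranks.
    """
    sim = np.asarray(similarity, dtype=float)
    rel = np.asarray(relevance, dtype=float)
    diff = sim[None, :] - sim[:, None]       # diff[i, j] = s_j - s_i
    # Numerically stable sigmoid: sigmoid(x) = 0.5 * (1 + tanh(x / 2)).
    soft_greater = 0.5 * (1.0 + np.tanh(diff / (2.0 * tau)))
    np.fill_diagonal(soft_greater, 0.0)      # an item does not outrank itself
    soft_rank = 1.0 + soft_greater.sum(axis=1)
    gains = 2.0 ** rel - 1.0
    soft_dcg = np.sum(gains / np.log2(soft_rank + 1.0))
    ideal_dcg = dcg(np.sort(rel)[::-1])
    return float(soft_dcg / ideal_dcg) if ideal_dcg > 0 else 0.0


# Hypothetical example: four candidate images for one caption.
relevance = [1.0, 0.1, 0.7, 0.0]             # assumed graded relevance scores
good_similarity = [0.9, 0.2, 0.6, 0.1]       # ranking agrees with relevance
bad_similarity = [0.1, 0.9, 0.2, 0.6]        # ranking disagrees with relevance

listwise_loss_good = 1.0 - smooth_ndcg(good_similarity, relevance)
listwise_loss_bad = 1.0 - smooth_ndcg(bad_similarity, relevance)
```

In this sketch the listwise loss is `1 - smooth_ndcg(...)`, so a ranking that places less relevant negatives above more relevant ones is penalized even when no annotated positive is involved. In a plug-and-play setting such a term would presumably be added to the existing pairwise loss with a weighting factor, which is also an assumption here.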