In image retrieval, standard evaluation metrics rely on score ranking, e.g., average precision (AP), recall at k (R@k), and normalized discounted cumulative gain (NDCG). In this work, we introduce a general framework for robust and decomposable optimization of rank losses. It addresses two major challenges for end-to-end training of deep neural networks with rank losses: non-differentiability and non-decomposability. First, we propose a general surrogate for the ranking operator, SupRank, that is amenable to stochastic gradient descent. It provides an upper bound for rank losses and ensures robust training. Second, we use a simple yet effective loss function to reduce the decomposability gap between the averaged batch approximation of ranking losses and their values on the whole training set. We apply our framework to two standard metrics for image retrieval: AP and R@k. Additionally, we apply our framework to hierarchical image retrieval. We introduce an extension of AP, the hierarchical average precision $\mathcal{H}$-AP, and optimize it as well as the NDCG. Finally, we create the first hierarchical landmarks retrieval dataset. We use a semi-automatic pipeline to create hierarchical labels, extending the large-scale Google Landmarks v2 dataset. The hierarchical dataset is publicly available at https://github.com/cvdfoundation/google-landmark. Code will be released at https://github.com/elias-ramzi/SupRank.
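To make the non-differentiability concrete, the following minimal sketch computes AP and R@k for a single query by writing each rank as a sum of step functions over score differences. The function names, the no-ties assumption, and the R@k convention (1 if any positive appears in the top k) are illustrative assumptions, not taken from the released code. The last function is a generic sigmoid relaxation of the rank, included only to show what a differentiable surrogate can look like; it is not the SupRank surrogate and, unlike SupRank, it is not an upper bound on the rank.

```python
import numpy as np

def heaviside_rank(scores):
    """Rank of each item (1 = highest score) as a sum of step functions.

    Assumes no tied scores. The step function has zero gradient almost
    everywhere, which is why rank-based metrics cannot be optimized directly.
    """
    scores = np.asarray(scores, dtype=float)
    # diff[k, j] = s_j - s_k; item j outranks item k when diff[k, j] > 0.
    diff = scores[None, :] - scores[:, None]
    return 1 + (diff > 0).sum(axis=1)

def average_precision(scores, labels):
    """AP for one query: mean over positives of (rank among positives) / (rank among all)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    rank_all = heaviside_rank(scores)
    rank_pos = heaviside_rank(scores[labels])
    return float(np.mean(rank_pos / rank_all[labels]))

def recall_at_k(scores, labels, k):
    """R@k under the assumed convention: 1.0 if any positive is ranked in the top k."""
    labels = np.asarray(labels, dtype=bool)
    rank_all = heaviside_rank(scores)
    return float(np.any(rank_all[labels] <= k))

def smooth_rank(scores, tau=0.01):
    """Generic sigmoid relaxation of the rank (illustration only, NOT SupRank)."""
    scores = np.asarray(scores, dtype=float)
    diff = (scores[None, :] - scores[:, None]) / tau
    sig = 1.0 / (1.0 + np.exp(-diff))
    # An item does not outrank itself, so drop the sigmoid(0) = 0.5 diagonal terms.
    np.fill_diagonal(sig, 0.0)
    return 1.0 + sig.sum(axis=1)

if __name__ == "__main__":
    scores = [0.9, 0.8, 0.7, 0.6]
    labels = [1, 0, 1, 0]  # positives sit at ranks 1 and 3
    print(average_precision(scores, labels))
    print(recall_at_k(scores, labels, k=1))
    print(smooth_rank(scores))
```

With a small temperature `tau`, the relaxed ranks approach the hard ranks but the gradients become very sharp; larger values give smoother gradients at the cost of a looser approximation.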