The performance of neural networks in content-based image retrieval (CBIR) is highly influenced by the chosen loss (objective) function. Most objective functions for neural models fall into one of two categories: metric learning and statistical learning. Metric learning approaches require a pair mining strategy that is often inefficient, while statistical learning approaches do not generate highly compact features because they optimize the features only indirectly. To this end, we propose a novel repeller-attractor loss that falls in the metric learning paradigm, yet directly optimizes for the L2 metric without the need to generate pairs. Our loss consists of three components. The leading objective ensures that the learned features are attracted to their designated learnable class anchors. The second component regulates the anchors and forces them to be separable by a margin, while the third ensures that the anchors do not collapse to zero. Furthermore, we develop a more efficient two-stage retrieval system that harnesses the learned class anchors during the first stage of the retrieval process, eliminating the need to compare the query with every image in the database. We consider a set of four datasets (CIFAR-100, Food-101, SVHN, and Tiny ImageNet) and evaluate the proposed objective on the CBIR task under both few-shot and full-set training, using both convolutional and transformer architectures. Our empirical evidence shows that, compared to existing objective functions, the proposed objective generates superior and more consistent results.
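The three loss components and the two-stage retrieval described above can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything below is an assumption made for illustration: the squared-distance attractor, the hinge forms of the repeller and minimum-norm terms, the unweighted sum, the `margin` and `min_norm` parameters, and the layout of `database` as a per-class list. Plain Python is used to keep the example self-contained; a real implementation would likely use a tensor library and learn the anchors by gradient descent.

```python
import math


def l2(u, v):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def attractor_loss(features, labels, anchors):
    """Mean squared L2 distance between each feature and its class anchor."""
    total = sum(l2(f, anchors[y]) ** 2 for f, y in zip(features, labels))
    return total / len(features)


def repeller_loss(anchors, margin):
    """Assumed hinge form: penalize anchor pairs closer than 2 * margin."""
    n = len(anchors)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    if not pairs:
        return 0.0
    total = sum(max(0.0, 2 * margin - l2(anchors[i], anchors[j])) ** 2
                for i, j in pairs)
    return total / len(pairs)


def min_norm_loss(anchors, min_norm):
    """Assumed hinge form: penalize anchors whose norm is below min_norm."""
    origin = [0.0] * len(anchors[0])
    total = sum(max(0.0, min_norm - l2(c, origin)) ** 2 for c in anchors)
    return total / len(anchors)


def total_loss(features, labels, anchors, margin=1.0, min_norm=1.0):
    """Unweighted sum of the three components (the weighting is an assumption)."""
    return (attractor_loss(features, labels, anchors)
            + repeller_loss(anchors, margin)
            + min_norm_loss(anchors, min_norm))


def two_stage_retrieve(query, anchors, database, top_k=5):
    """Two-stage retrieval sketch.

    `database` is a hypothetical dict mapping a class index to a list of
    (item_id, feature) tuples for the images assigned to that class.
    """
    # Stage 1: compare the query only with the class anchors.
    nearest = min(range(len(anchors)), key=lambda c: l2(query, anchors[c]))
    # Stage 2: rank only the images that belong to the nearest anchor's class.
    candidates = database.get(nearest, [])
    ranked = sorted(candidates, key=lambda item: l2(query, item[1]))
    return [item_id for item_id, _ in ranked[:top_k]]
```

With C classes and N database images, the first stage in this sketch costs C distance computations and the second only as many as there are images in the selected class, rather than N. Restricting the second stage to a single class is a simplification; a system could also keep several nearest anchors to trade speed for recall.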