With the explosive growth in the number of fine-grained images in the Internet era, fast and efficient retrieval from large-scale fine-grained image collections has become a challenging problem. Among the many retrieval methods, hashing methods are widely used because of their high efficiency and low storage cost. Fine-grained hashing is more challenging than traditional hashing because fine-grained images exhibit low inter-class variance and high intra-class variance. To improve the retrieval accuracy of fine-grained hashing, we propose a cascaded network that learns compact and highly semantic hash codes, and we introduce an attention-guided data augmentation method. We refer to this network as a cascaded hierarchical data augmentation network. We also propose a novel approach that balances the losses of multi-task learning in a coordinated manner. We conduct extensive experiments on several common fine-grained visual classification datasets. The experimental results demonstrate that the proposed method outperforms several state-of-the-art hashing methods and effectively improves the accuracy of fine-grained retrieval. The source code is publicly available at https://github.com/kaiba007/FG-CNET.
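As a rough illustration of why hashing methods offer fast retrieval with a small storage footprint, the sketch below shows generic sign-based binarization and Hamming-distance ranking. It is not the proposed cascaded network: the function names `binarize`, `hamming_distance`, and `retrieve`, as well as the toy feature values, are assumptions made purely for illustration.

```python
from typing import List, Sequence, Tuple


def binarize(features: Sequence[float]) -> int:
    """Pack the signs of a real-valued feature vector into an integer hash code."""
    code = 0
    for value in features:
        # Positive entries become bit 1, all others bit 0 (an assumed convention).
        code = (code << 1) | (1 if value > 0 else 0)
    return code


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two hash codes differ."""
    return bin(a ^ b).count("1")


def retrieve(query: int, database: List[int], top_k: int) -> List[Tuple[int, int]]:
    """Return (index, distance) pairs for the top_k closest database codes."""
    scored = [(idx, hamming_distance(query, code)) for idx, code in enumerate(database)]
    # Sort by distance first, then by index so that ties are broken deterministically.
    scored.sort(key=lambda pair: (pair[1], pair[0]))
    return scored[:top_k]


# Toy 4-bit example with made-up feature vectors (hypothetical, not from the paper).
database = [
    binarize([0.9, -0.2, 0.4, -0.7]),   # bits 1010
    binarize([0.8, -0.1, -0.3, -0.5]),  # bits 1000
    binarize([-0.6, 0.7, -0.2, 0.3]),   # bits 0101
]
query = binarize([0.7, -0.4, 0.5, -0.1])  # bits 1010
results = retrieve(query, database, top_k=2)
print(results)
```

Each image is stored as a handful of bits, and comparing two codes reduces to an XOR and a bit count, which is where the efficiency and storage advantages of hashing come from. A learned method such as the one proposed here would replace the hand-written feature vectors with network outputs.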