Fine-grained Visual Classification with High-temperature Refinement and Background Suppression

Fine-grained visual classification is a challenging task because categories are highly similar to one another while samples within a single category vary widely. To address these challenges, previous strategies have focused on localizing subtle discrepancies between categories and enhancing the discriminative features in them. However, the background also provides important information that can tell the model which features are unnecessary or even harmful for classification, and models that rely too heavily on subtle features may overlook global features and contextual information. In this paper, we propose a novel network called "High-temperaturE Refinement and Background Suppression" (HERBS), which consists of two modules: the high-temperature refinement module, for extracting discriminative features, and the background suppression module, for suppressing background noise. The high-temperature refinement module allows the model to learn appropriate feature scales by refining the feature maps at different scales and improving the learning of diverse features. The background suppression module first splits the feature map into foreground and background using classification confidence scores, then suppresses feature values in low-confidence areas while enhancing discriminative features. The experimental results show that the proposed HERBS effectively fuses features of varying scales, suppresses background noise, and extracts discriminative features at appropriate scales for fine-grained visual classification. The proposed method achieves state-of-the-art performance on the CUB-200-2011 and NABirds benchmarks, surpassing 93% accuracy on both datasets. Thus, HERBS presents a promising solution for improving the performance of fine-grained visual classification tasks. Code: https://github.com/chou141253/FGVC-HERBS
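
The two ideas named above can be illustrated with a minimal NumPy sketch. It is not the HERBS implementation: the function names `softmax` and `suppress_background`, the parameters `keep_ratio` and `scale`, and the rule of keeping a fixed fraction of patches are all assumptions made for illustration. The sketch only shows (a) that dividing logits by a higher temperature produces a smoother class distribution, and (b) one plausible way to split patch features into foreground and background by classification confidence and damp the low-confidence ones.

```python
import numpy as np


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over the last axis."""
    scaled = np.asarray(logits, dtype=float) / temperature
    scaled = scaled - scaled.max(axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(scaled)
    return exp / exp.sum(axis=-1, keepdims=True)


def suppress_background(features, patch_logits, keep_ratio=0.5, scale=0.1):
    """Damp low-confidence patches (hypothetical rule, not the paper's).

    features:     (N, D) array, one feature vector per spatial patch.
    patch_logits: (N, C) array, per-patch class logits.
    keep_ratio:   assumed fraction of patches treated as foreground.
    scale:        assumed multiplier applied to background patches.
    """
    # Confidence of a patch = its highest class probability.
    confidence = softmax(patch_logits).max(axis=-1)
    num_keep = max(1, int(round(keep_ratio * len(confidence))))
    order = np.argsort(-confidence)  # most confident patches first
    foreground = np.zeros(len(confidence), dtype=bool)
    foreground[order[:num_keep]] = True
    weights = np.where(foreground, 1.0, scale)
    return features * weights[:, None], foreground


# A higher temperature flattens the distribution over classes.
logits = np.array([2.0, 1.0, 0.0])
sharp = softmax(logits, temperature=1.0)
smooth = softmax(logits, temperature=4.0)

# Four patches, two classes: patches 0 and 2 are confidently classified.
features = np.ones((4, 2))
patch_logits = np.array([[5.0, 0.0], [0.0, 0.0], [0.0, 4.0], [0.1, 0.0]])
suppressed, foreground = suppress_background(features, patch_logits)
```

Here the two confidently classified patches keep their feature values and the other two are scaled down. A real model would operate on multi-scale feature maps inside the network, and the released code at the URL above should be consulted for the authors' actual formulation.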
