CLIP-based foreground-background (FG-BG) decomposition methods have proven highly effective at improving few-shot out-of-distribution (OOD) detection. However, existing approaches still have several limitations. For the background regions obtained from decomposition, they apply a uniform suppression strategy to all patches, overlooking the varying contributions of different patches to the prediction. For the foreground regions, they fail to account for local patches whose appearance or semantics resemble other classes, which can mislead training. To address these issues, we propose a new plug-and-play framework consisting of three core components: (1) a Foreground-Background Decomposition module, which follows previous FG-BG methods to separate an image into foreground and background regions; (2) an Adaptive Background Suppression module, which adaptively weights background patches based on their classification entropy; and (3) a Confusable Foreground Rectification module, which identifies and rectifies foreground patches that are easily confused with other classes. Extensive experiments demonstrate that the proposed framework significantly improves the performance of existing FG-BG decomposition methods. Code is available at: https://github.com/lounwb/FoBoR.
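To make the Adaptive Background Suppression idea concrete, the sketch below shows one plausible way to turn per-patch classification entropy into non-uniform suppression weights: patches classified with low entropy (high confidence) receive larger weights than near-uniform ones. This is a minimal illustration under assumed shapes and a hypothetical weighting rule, not the authors' implementation; the function name, the entropy inversion, and the normalization are all assumptions.

```python
import numpy as np

def patch_entropy_weights(patch_logits, temperature=1.0):
    """Hypothetical sketch: weight patches by their classification entropy.

    patch_logits: (num_patches, num_classes) per-patch class logits
                  (e.g. CLIP patch-text similarities).
    Returns (entropy, weights): per-patch softmax entropy, and normalized
    weights that favor confidently classified (low-entropy) patches.
    """
    z = patch_logits / temperature
    z = z - z.max(axis=1, keepdims=True)                 # numerical stability
    p = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)  # softmax per patch
    entropy = -(p * np.log(p + 1e-12)).sum(axis=1)        # per-patch entropy

    # Invert entropy so confident patches get larger weights (one plausible
    # choice; the actual weighting rule in the paper may differ).
    max_h = np.log(patch_logits.shape[1])  # maximum possible entropy, log(C)
    weights = (max_h - entropy) + 1e-8     # epsilon guards all-uniform input
    weights = weights / weights.sum()
    return entropy, weights
```

A uniform-suppression baseline would instead assign every background patch the weight `1 / num_patches`; the entropy-based variant deviates from that baseline exactly where patch predictions differ in confidence.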