Open-set recognition (OSR), the identification of novel categories, can be a critical component when deploying classification models in real-world applications. Recent work has shown that familiarity-based scoring rules such as the Maximum Softmax Probability (MSP) or the Maximum Logit Score (MLS) are strong baselines when the closed-set accuracy is high. However, one potential weakness of familiarity-based OSR is its vulnerability to adversarial attacks. Here, we study gradient-based adversarial attacks on familiarity scores for two attack types, False Familiarity and False Novelty, and evaluate their effectiveness in informed and uninformed settings on TinyImageNet. Furthermore, we explore how novel and familiar samples react to adversarial attacks and formulate the adversarial reaction score as an alternative OSR scoring rule, which correlates strongly with the MLS familiarity score.
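To make the two familiarity scores and the two attack directions concrete, the following is a minimal sketch and not the method evaluated in this work. It assumes a hypothetical linear classifier (`weights`, `bias`), for which the gradient of the maximum logit with respect to the input is simply the weight row of the top class, and it uses a single FGSM-style sign step. The reading that a False Familiarity attack raises the score and a False Novelty attack lowers it is inferred from the attack names; all function names and example values are illustrative assumptions.

```python
import math


def msp_score(logits):
    """Maximum Softmax Probability: the largest softmax output."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]  # subtract the max for numerical stability
    return max(exps) / sum(exps)


def mls_score(logits):
    """Maximum Logit Score: the largest raw logit."""
    return max(logits)


def linear_logits(weights, bias, x):
    """Logits of a hypothetical linear classifier: W x + b."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def sign(v):
    return (v > 0) - (v < 0)


def one_step_attack(weights, bias, x, eps, direction):
    """One FGSM-style step on the MLS of a linear model (illustrative assumption).

    direction=+1 pushes the score up (assumed False Familiarity),
    direction=-1 pushes the score down (assumed False Novelty).
    """
    logits = linear_logits(weights, bias, x)
    top = logits.index(max(logits))
    grad = weights[top]  # d(max logit)/dx for a linear model
    return [v + direction * eps * sign(g) for v, g in zip(x, grad)]


# Toy example with made-up values.
weights = [[1.0, -2.0], [0.5, 0.5]]
bias = [0.0, 0.0]
x = [1.0, 0.0]

clean_mls = mls_score(linear_logits(weights, bias, x))
clean_msp = msp_score(linear_logits(weights, bias, x))
x_familiar = one_step_attack(weights, bias, x, eps=0.1, direction=+1)
x_novel = one_step_attack(weights, bias, x, eps=0.1, direction=-1)
mls_familiar = mls_score(linear_logits(weights, bias, x_familiar))
mls_novel = mls_score(linear_logits(weights, bias, x_novel))
```

In this toy setting the upward step increases the MLS and the downward step decreases it. For a deep network the gradient would come from automatic differentiation instead of a weight row, and a single step is not guaranteed to move the score in the intended direction.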