CIFS: Improving Adversarial Robustness of CNNs via Channel-wise Importance-based Feature Selection

We investigate the adversarial robustness of CNNs from the perspective of channel-wise activations. By comparing \textit{non-robust} (normally trained) and \textit{robustified} (adversarially trained) models, we observe that adversarial training (AT) robustifies CNNs by aligning the channel-wise activations of adversarial data with those of their natural counterparts. However, the channels that are \textit{negatively-relevant} (NR) to predictions are still over-activated when processing adversarial data. We also observe that AT does not yield similar robustness for all classes. For the robust classes, channels with larger activation magnitudes are usually more \textit{positively-relevant} (PR) to predictions, but this alignment does not hold for the non-robust classes. Given these observations, we hypothesize that suppressing NR channels and aligning the activations of PR channels with their relevance further enhances the robustness of CNNs under AT. To examine this hypothesis, we introduce a novel mechanism, \underline{C}hannel-wise \underline{I}mportance-based \underline{F}eature \underline{S}election (CIFS). CIFS manipulates the channel activations of certain layers by generating non-negative multipliers for these channels based on their relevance to predictions. Extensive experiments on benchmark datasets, including CIFAR10 and SVHN, verify the hypothesis and the effectiveness of CIFS in robustifying CNNs.
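The channel-reweighting idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a hypothetical linear probe on globally pooled channel activations, takes the probe's top-1 class as the prediction, uses the gradient of that class's logit with respect to each channel as the relevance score, and maps relevance to a non-negative multiplier with a temperature-scaled sigmoid. The names `probe_logits`, `cifs_like_reweight`, and `temperature` are illustrative assumptions.

```python
import math
from typing import List, Tuple


def sigmoid(x: float) -> float:
    """Numerically stable logistic function; output lies in (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def probe_logits(weights: List[List[float]], activations: List[float]) -> List[float]:
    """Hypothetical linear probe: one logit per class from pooled channel activations."""
    return [sum(w * a for w, a in zip(row, activations)) for row in weights]


def cifs_like_reweight(
    weights: List[List[float]],
    activations: List[float],
    temperature: float = 0.1,
) -> Tuple[int, List[float], List[float]]:
    """Scale each channel by a non-negative multiplier derived from its relevance.

    Assumption: for a linear probe, the gradient of the predicted class's logit
    with respect to channel c is simply weights[predicted][c]; this serves as
    the relevance score in this sketch.
    """
    logits = probe_logits(weights, activations)
    predicted = max(range(len(logits)), key=logits.__getitem__)
    relevance = weights[predicted]
    # Negatively-relevant channels get multipliers near 0,
    # positively-relevant channels get multipliers near 1.
    multipliers = [sigmoid(r / temperature) for r in relevance]
    reweighted = [m * a for m, a in zip(multipliers, activations)]
    return predicted, multipliers, reweighted


# Toy example: 2 classes, 3 channels. Channel 1 is negatively relevant to class 0.
toy_weights = [[1.0, -1.0, 0.5], [-1.0, 1.0, 0.2]]
toy_activations = [2.0, 1.5, 1.0]
predicted, multipliers, reweighted = cifs_like_reweight(toy_weights, toy_activations)
```

In this toy setting the probe predicts class 0, so the channel with a negative weight for class 0 is scaled towards zero while the positively-relevant channels are left nearly unchanged. The sigmoid mapping is only one possible choice; any function producing non-negative multipliers would fit the description above.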
