Deep Learning (DL) is rapidly maturing to the point that it can be used in safety- and security-critical applications. However, adversarial samples, which are imperceptible to the human eye, pose a serious threat that can cause a model to misbehave and compromise the performance of such applications. Understanding and improving the robustness of DL models has therefore become crucial to defending against adversarial attacks. In this study, we perform comprehensive experiments to examine the effect of adversarial attacks and defenses on various model architectures across well-known datasets. Our research focuses on black-box attacks such as SimBA, HopSkipJump, MGAAttack, and boundary attacks, as well as preprocessor-based defense mechanisms, including bit squeezing, median smoothing, and JPEG filtering. Experimenting with various models, we find that the level of noise needed for a successful attack increases with the number of layers, while the attack success rate decreases as the number of layers increases. This indicates a significant relationship between model complexity and robustness. Investigating the relationship between diversity and robustness, our experiments with diverse models show that a large number of parameters does not imply higher robustness. We further examine the effect of the training dataset on model robustness, using various datasets such as ImageNet-1000, CIFAR-100, and CIFAR-10 to evaluate the black-box attacks. Considering the multiple dimensions of our analysis, e.g., model complexity and training dataset, we also examine the behavior of black-box attacks when models apply defenses. Our results show that applying defense strategies can significantly reduce attack effectiveness. This research provides in-depth analysis of, and insight into, the robustness of DL models against various attacks and defenses.
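Among the black-box attacks named above, SimBA is the simplest to convey in code: it greedily probes one basis direction at a time and keeps a step only if it lowers the model's confidence in the true class. The sketch below illustrates that core idea on a toy linear-softmax "model" (the toy model, its weights, and all parameter values are illustrative assumptions, not the paper's experimental setup):

```python
import numpy as np

def simba_attack(predict_proba, x, true_label, epsilon=0.2, max_iters=500, seed=0):
    """Minimal SimBA sketch: try +/-epsilon steps along randomly ordered
    pixel (Cartesian basis) directions, keeping a step only if it lowers
    the black-box model's probability for the true class."""
    rng = np.random.default_rng(seed)
    x_adv = x.copy()
    dims = rng.permutation(x.size)          # random order over basis directions
    p_true = predict_proba(x_adv)[true_label]
    for d in dims[:max_iters]:
        for sign in (+1.0, -1.0):
            step = np.zeros(x.size)
            step[d] = sign * epsilon
            candidate = np.clip(x_adv + step.reshape(x.shape), 0.0, 1.0)
            p_new = predict_proba(candidate)[true_label]
            if p_new < p_true:              # keep the step if it hurts the true class
                x_adv, p_true = candidate, p_new
                break
    return x_adv, p_true

def make_toy_model():
    """Hypothetical stand-in classifier: linear logits + softmax, 3 classes,
    2-dimensional input. Only the probabilities are exposed (black-box)."""
    W = np.array([[2.0, -1.0], [-1.0, 2.0], [0.5, 0.5]])
    def predict_proba(x):
        z = W @ x.ravel()
        e = np.exp(z - z.max())
        return e / e.sum()
    return predict_proba
```

Because SimBA only queries output probabilities, it needs no gradients, which is what makes it a black-box attack.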
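The preprocessor-based defenses mentioned in the abstract share one principle: transform the input so that low-amplitude adversarial noise is destroyed before the model sees it. The following is a minimal sketch of two of them, median smoothing and bit-depth reduction ("bit squeezing"), written in plain NumPy for illustration (kernel size and bit depth are assumed values, not the paper's configuration):

```python
import numpy as np

def median_smooth(img, k=3):
    """Median smoothing defense: replace each pixel with the median of its
    k x k neighborhood (edge padding), washing out isolated adversarial
    perturbations. `img` is a 2-D array with values in [0, 1]."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.empty_like(img)
    h, w = img.shape
    for i in range(h):
        for j in range(w):
            out[i, j] = np.median(padded[i:i + k, j:j + k])
    return out

def bit_squeeze(img, bits=3):
    """Bit-depth reduction: quantize pixel values to 2**bits levels,
    collapsing small adversarial perturbations onto the same level."""
    levels = 2 ** bits - 1
    return np.round(img * levels) / levels
```

JPEG filtering plays the same role via lossy compression; in practice it is typically applied through an image library rather than reimplemented.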