Adversarial machine learning (AML) studies attacks that fool machine learning algorithms into producing incorrect outcomes, as well as defenses that strengthen model robustness against worst-case attacks. In image classification specifically, adversarial attacks are hard to understand because they rely on subtle perturbations that are not human-interpretable, and because their impact varies with the attack methodology, the individual instance, and the model architecture. Through a design study with AML learners and teachers, we introduce AdvEx, a multi-level interactive visualization system that comprehensively presents the properties and impacts of evasion attacks on different image classifiers for novice AML learners. We assessed AdvEx quantitatively and qualitatively in a two-part evaluation comprising user studies and expert interviews. Our results show that AdvEx not only is highly effective as a visualization tool for understanding AML mechanisms, but also provides an engaging and enjoyable learning experience, demonstrating its overall benefits for AML learners.