Adversarial Machine Learning (AML) refers to the ability to disrupt Machine Learning (ML) algorithms through a range of methods that broadly exploit the architecture of deep learning optimisation. This paper presents Distributed Adversarial Regions (DAR), a novel method that implements distributed instantiations of computer vision-based AML attacks, which may be used to disguise objects from image recognition in both white-box and black-box settings. We consider the context of object detection models used in urban environments, and benchmark the MobileNetV2, NasNetMobile and DenseNet169 models against a subset of relevant images from the ImageNet dataset. We evaluate optimal parameters (size, number and perturbation method), and compare DARs with state-of-the-art AML techniques that perturb the entire image. We find that DARs can reduce confidence by 40.4% on average, with the benefit that neither the entire image nor the focal object needs to be perturbed. The DAR method is a deliberately simple approach: the intention is to highlight how an adversary with very little skill could attack models that may already be productionised, and to emphasise the fragility of foundational object detection models. We present this as a contribution to the field of ML security as well as AML. This paper contributes a novel adversarial method and an original comparison between DARs and other AML methods, and frames them in a new context: urban camouflage and the necessity for ML security and model robustness.
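The idea of perturbing a few distributed regions rather than the whole image, while leaving the focal object untouched, can be illustrated with a minimal sketch. This is not the paper's implementation: the function name `apply_regions`, its parameters, and the use of uniform random noise as the perturbation are all assumptions made for illustration (the abstract does not specify the perturbation methods evaluated). The `size` and `num_regions` arguments stand in for the size and number parameters mentioned above.

```python
import numpy as np


def apply_regions(image, num_regions=4, size=16, epsilon=0.5, exclude=None, seed=0):
    """Perturb up to `num_regions` square patches of side `size` with uniform noise.

    Hypothetical helper, not the paper's method.
    image   : float array of shape (H, W, C) with values in [0, 1]
    exclude : optional (top, left, bottom, right) box to leave untouched,
              e.g. the bounding box of the focal object
    Returns the perturbed copy and a boolean (H, W) mask of perturbed pixels.
    """
    rng = np.random.default_rng(seed)
    h, w = image.shape[:2]
    out = image.copy()
    mask = np.zeros((h, w), dtype=bool)

    placed, attempts = 0, 0
    while placed < num_regions and attempts < 1000:
        attempts += 1
        top = int(rng.integers(0, h - size + 1))
        left = int(rng.integers(0, w - size + 1))

        if exclude is not None:
            et, el, eb, er = exclude
            # Reject candidate patches that overlap the excluded box.
            if top < eb and top + size > et and left < er and left + size > el:
                continue

        # Assumed perturbation: additive uniform noise, clipped to [0, 1].
        noise = rng.uniform(-epsilon, epsilon, size=(size, size, image.shape[2]))
        patch = out[top:top + size, left:left + size]
        out[top:top + size, left:left + size] = np.clip(patch + noise, 0.0, 1.0)
        mask[top:top + size, left:left + size] = True
        placed += 1  # patches may overlap each other in this sketch

    return out, mask


# Usage: a flat grey 64x64 image with a protected 16x16 "focal object" box.
image = np.full((64, 64, 3), 0.5)
perturbed, mask = apply_regions(image, exclude=(24, 24, 40, 40))
```

In a white-box setting the random noise would presumably be replaced by a gradient-based perturbation restricted to the same mask; measuring the effect would then amount to comparing a classifier's confidence on `image` and on `perturbed`.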