Adversarial attacks pose a significant challenge to the reliable deployment of machine learning models in EdgeAI applications, such as autonomous driving and surveillance, which rely on resource-constrained devices for real-time inference. Among these, patch-based adversarial attacks, where small malicious patches (e.g., stickers) are applied to objects, can deceive neural networks into making incorrect predictions with potentially severe consequences. In this paper, we present PatchBlock, a lightweight framework designed to detect and neutralize adversarial patches in images. Leveraging outlier detection and dimensionality reduction, PatchBlock identifies regions affected by adversarial noise and suppresses their impact. It operates as a pre-processing module at the sensor level, running efficiently on CPUs in parallel with GPU inference, thus preserving system throughput while avoiding additional GPU overhead. The framework follows a three-stage pipeline: splitting the input into chunks (Chunking), detecting anomalous regions via a redesigned isolation forest with targeted cuts for faster convergence (Separating), and applying dimensionality reduction to the identified outliers (Mitigating). PatchBlock is both model- and patch-agnostic, can be retrofitted to existing pipelines, and integrates seamlessly between sensor inputs and downstream models. Evaluations across multiple neural architectures, benchmark datasets, attack types, and diverse edge devices demonstrate that PatchBlock consistently improves robustness, recovering up to 77% of model accuracy under strong patch attacks such as the Google Adversarial Patch, while maintaining high portability and minimal clean-accuracy loss. Additionally, PatchBlock outperforms state-of-the-art defenses in efficiency, in terms of computation time and energy consumption per sample, making it well suited to EdgeAI applications.
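To make the three-stage pipeline concrete, the following is a minimal illustrative sketch, not the paper's implementation: all function names, chunk sizes, and thresholds are hypothetical. The Separating stage is approximated with scikit-learn's stock IsolationForest (the paper's redesigned variant with targeted cuts is not public here), and the Mitigating stage is approximated with PCA reconstruction as the dimensionality-reduction step.

```python
# Hypothetical PatchBlock-style pre-processing sketch (not the authors' code).
# Dependencies: numpy, scikit-learn.
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.decomposition import PCA

def chunk(image, size=8):
    """Chunking: split an HxWxC image into non-overlapping size x size chunks,
    each flattened into a feature vector."""
    h, w, c = image.shape
    chunks = (image[:h - h % size, :w - w % size]
              .reshape(h // size, size, w // size, size, c)
              .swapaxes(1, 2)
              .reshape(-1, size * size * c))
    return chunks

def separate(chunks, contamination=0.05):
    """Separating: flag anomalous chunks (candidate patch regions) with an
    isolation forest. Returns a boolean mask, True = outlier chunk."""
    forest = IsolationForest(contamination=contamination, random_state=0)
    return forest.fit_predict(chunks) == -1  # -1 marks outliers

def mitigate(chunks, outlier_mask, n_components=2):
    """Mitigating: suppress flagged chunks by projecting them onto a
    low-dimensional subspace fit on the inlier chunks, discarding the
    high-frequency adversarial content."""
    pca = PCA(n_components=n_components).fit(chunks[~outlier_mask])
    cleaned = chunks.copy()
    cleaned[outlier_mask] = pca.inverse_transform(
        pca.transform(chunks[outlier_mask]))
    return cleaned

# Toy example: a smooth benign image with a high-variance "patch" region.
rng = np.random.default_rng(0)
image = rng.normal(0.5, 0.05, (64, 64, 3))          # benign background
image[8:24, 8:24] = rng.uniform(0, 1, (16, 16, 3))  # simulated patch
chunks = chunk(image)
mask = separate(chunks)
cleaned = mitigate(chunks, mask)
```

Because the module only reads and rewrites pixel chunks, it can sit between the sensor and any downstream model, which is what makes the approach model- and patch-agnostic.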