Deep Reinforcement Learning (DRL) enhances the efficiency of Autonomous Vehicles (AVs), but also makes them susceptible to backdoor attacks that can result in traffic congestion or collisions. A backdoor is typically implanted by contaminating the training dataset with covert malicious data, so that the model maintains high accuracy on genuine inputs while producing the adversary's desired (malicious) outputs for specific inputs chosen by the adversary. Current backdoor defenses mainly target image classification and rely on image-based features. They cannot be readily transferred to the regression task of DRL-based AV controllers, whose inputs are continuous sensor data, i.e., combinations of the velocities and distances of the AV and its surrounding vehicles. Our proposed method neutralizes backdoors by adding well-designed noise to the input. Specifically, it learns an optimal smoothing (noise) distribution that preserves normal behavior on genuine inputs while neutralizing backdoors. The resulting model is expected to be more resilient against backdoor attacks while maintaining high accuracy on genuine inputs. The effectiveness of the proposed method is verified on a simulated traffic system built on a microscopic traffic simulator, where experimental results show that the smoothed traffic controller neutralizes all trigger samples and maintains its performance in relieving traffic congestion.
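To make the idea of input smoothing concrete, the following is a minimal sketch, not the method proposed here. It assumes a fixed isotropic Gaussian noise distribution and a median aggregate, whereas the proposed approach learns the smoothing distribution. The function `base_controller`, its trigger state, and the parameters `sigma` and `n_samples` are hypothetical stand-ins for a backdoored DRL policy and its smoothing settings.

```python
import numpy as np


def base_controller(state):
    """Hypothetical stand-in for a backdoored DRL policy.

    state = [av_velocity, lead_velocity, gap]; returns an acceleration.
    A narrow region around one specific state acts as the backdoor trigger.
    """
    av_v, lead_v, gap = state
    trigger = np.array([10.0, 5.0, 8.0])  # assumed trigger, for illustration
    if np.all(np.abs(state - trigger) < 0.05):
        return 3.0  # malicious output: accelerate toward a slower leader
    # Benign behaviour: a simple proportional car-following rule.
    return float(np.clip(0.5 * (lead_v - av_v) + 0.1 * (gap - 10.0), -3.0, 3.0))


def smoothed_controller(state, sigma=0.2, n_samples=200, seed=0):
    """Median of the base controller's outputs under Gaussian input noise.

    sigma is fixed here by assumption; it is not learned.
    """
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, sigma, size=(n_samples, state.shape[0]))
    outputs = np.array([base_controller(state + eps) for eps in noise])
    return float(np.median(outputs))


trigger_state = np.array([10.0, 5.0, 8.0])
benign_state = np.array([10.0, 10.0, 12.0])

print("base on trigger:    ", base_controller(trigger_state))
print("smoothed on trigger:", smoothed_controller(trigger_state))
print("base on benign:     ", base_controller(benign_state))
print("smoothed on benign: ", smoothed_controller(benign_state))
```

In this toy setting the trigger occupies only a small region of the input space, so most noisy samples fall outside it and the median follows the benign rule, while the output on a genuine input stays close to the unsmoothed one. How well this carries over to a real controller depends on the trigger's shape and on the choice of noise distribution, which is the quantity the proposed method optimizes.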