Attacking Autonomous Driving Agents with Adversarial Machine Learning: A Holistic Evaluation with the CARLA Leaderboard

To control vehicles autonomously, driving agents combine the outputs of machine-learning (ML) models, controller logic, and custom modules. Although numerous prior works have shown that adversarial examples can mislead ML models used in autonomous driving contexts, it remains unclear whether these attacks actually produce harmful driving actions across different agents, environments, and scenarios. To assess the risk that adversarial examples pose to autonomous driving, we evaluate attacks against a variety of complete driving agents rather than against ML models in isolation. To support this evaluation, we use CARLA, an urban driving simulator, to create and evaluate adversarial examples. We create adversarial patches designed to stop or steer driving agents, stream them into the CARLA simulator at runtime, and evaluate them against agents from the CARLA Leaderboard, a public repository of the best-performing autonomous driving agents from an annual research competition. Unlike prior work, we evaluate attacks against autonomous driving systems without creating or modifying any driving-agent code, and against all components of the agent, including the ML model. We perform a case-study investigation of two attack strategies against three open-source driving agents from the CARLA Leaderboard across multiple driving scenarios, lighting conditions, and locations. We find that, although some attacks successfully mislead ML models into predicting erroneous stopping or steering commands, some driving agents use modules, such as PID control or GPS-based rules, that can overrule attacker-manipulated predictions from the ML models.
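
To make the final observation concrete, the following is a minimal, purely illustrative sketch of how downstream modules might blunt a manipulated steering prediction. It is not code from any CARLA Leaderboard agent: the names `PIController`, `gps_guard`, and `run`, the gains, the deviation threshold, and the frame sequences are all assumptions chosen for illustration. The sketch assumes a hypothetical GPS-based rule that keeps the ML steering prediction within a fixed band around a route-derived value, followed by a proportional-integral controller (a PID controller with the derivative term omitted) that moves the applied command toward that bounded target.

```python
from dataclasses import dataclass


@dataclass
class PIController:
    """Proportional-integral controller (derivative term omitted for brevity)."""
    kp: float
    ki: float
    integral: float = 0.0

    def step(self, error: float, dt: float) -> float:
        # Accumulate the error over time and return the correction to apply.
        self.integral += error * dt
        return self.kp * error + self.ki * self.integral


def clamp(value: float, low: float, high: float) -> float:
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(high, value))


def gps_guard(ml_steer: float, route_steer: float, max_dev: float = 0.2) -> float:
    """Hypothetical GPS-based rule: bound the ML steering prediction to
    within max_dev of the steering implied by the planned route."""
    return clamp(ml_steer, route_steer - max_dev, route_steer + max_dev)


def run(ml_steer_per_frame, route_steer: float = 0.0, dt: float = 0.05):
    """Return the steering command actually applied on each frame."""
    controller = PIController(kp=0.5, ki=0.1)  # illustrative gains only
    applied = 0.0
    history = []
    for ml_steer in ml_steer_per_frame:
        target = gps_guard(ml_steer, route_steer)
        # The controller nudges the applied command toward the bounded target.
        applied = clamp(applied + controller.step(target - applied, dt), -1.0, 1.0)
        history.append(applied)
    return history


# Assumed inputs: a straight route, with and without three frames in which
# an attack drives the ML prediction to a hard-right command (1.0).
clean = [0.0] * 20
attacked = [0.0] * 5 + [1.0] * 3 + [0.0] * 12

if __name__ == "__main__":
    print("peak applied steering (clean):   ", max(abs(s) for s in run(clean)))
    print("peak applied steering (attacked):", max(abs(s) for s in run(attacked)))
```

In this toy setup the attacked prediction reaches full lock, yet the applied command stays inside the guard band. An attack evaluated only at the ML model's output would therefore overstate its effect on the vehicle, which is the motivation for evaluating whole agents.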
