ReMAV: Reward Modeling of Autonomous Vehicles for Finding Likely Failure Events

Autonomous vehicles are advanced driving systems known to be vulnerable to various adversarial attacks that compromise vehicle safety and endanger other road users. Rather than actively training complex adversaries by interacting with the environment, there is a need to first intelligently narrow the search space to only those states where the autonomous vehicle is less confident. In this paper, we propose a black-box testing framework, ReMAV, that first uses offline trajectories to analyze the existing behavior of autonomous vehicles and determine appropriate thresholds for finding the probability of failure events. Our reward modeling technique helps create a behavior representation that highlights regions of likely uncertain behavior even when the baseline autonomous vehicle is performing well. This approach allows for more efficient testing without the need for computationally expensive and inefficient active adversarial learning techniques. We perform our experiments in a high-fidelity urban driving environment using three different driving scenarios containing single- and multi-agent interactions. Our experiments show 35%, 23%, 48%, and 50% increases in occurrences of vehicle collision, road-object collision, pedestrian collision, and offroad steering events, respectively, by the autonomous vehicle under test, demonstrating a significant increase in failure events. We also perform a comparative analysis with prior testing frameworks and show that they underperform our approach in terms of training-testing efficiency, total infractions found, and simulation steps needed to identify the first failure. The results show that the proposed framework can be used to understand existing weaknesses of the autonomous vehicle under test so that only those regions are attacked, starting with simplistic perturbation models.

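The general idea of using offline trajectories and a threshold to shortlist states worth testing can be illustrated with a small sketch. The abstract does not specify the reward model, the state features, or how thresholds are chosen, so everything below is an assumption made for illustration: the linear least-squares model, the synthetic data, the 10th-percentile cutoff, and all function names are hypothetical and are not ReMAV's actual method.

```python
import numpy as np


def fit_linear_reward_model(states, rewards):
    """Least-squares fit of reward ~ states @ w + b.

    A hypothetical stand-in for a learned reward model.
    """
    # Append a column of ones so the last coefficient acts as the bias term.
    design = np.hstack([states, np.ones((states.shape[0], 1))])
    coef, *_ = np.linalg.lstsq(design, rewards, rcond=None)
    return coef


def predict_reward(coef, states):
    """Predict per-state reward with the fitted linear model."""
    design = np.hstack([states, np.ones((states.shape[0], 1))])
    return design @ coef


def flag_low_reward_states(predicted, percentile=10.0):
    """Return indices of states at or below a percentile threshold."""
    threshold = np.percentile(predicted, percentile)
    return np.flatnonzero(predicted <= threshold), threshold


# Synthetic "offline trajectory": 200 states with 3 features each (assumed).
rng = np.random.default_rng(0)
states = rng.normal(size=(200, 3))
true_w = np.array([1.0, -2.0, 0.5])
rewards = states @ true_w + 0.3  # noise-free toy rewards

coef = fit_linear_reward_model(states, rewards)
predicted = predict_reward(coef, states)

# Shortlist the lowest-reward states as candidate regions to perturb.
flagged, threshold = flag_low_reward_states(predicted, percentile=10.0)
```

In this toy setup, only the flagged states would be handed to a perturbation model, which mirrors the abstract's motivation of reducing the search space before any attack is attempted.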