ReMAV: Reward Modeling of Autonomous Vehicles for Finding Likely Failure Events

Autonomous vehicles are advanced driving systems that are well known to be vulnerable to various adversarial attacks, which compromise vehicle safety and pose a risk to other road users. Rather than actively training complex adversaries by interacting with the environment, there is a need to first intelligently reduce the search space to only those states where the autonomous vehicle is less confident. In this paper, we propose ReMAV, a black-box testing framework that first uses offline trajectories to analyze the existing behavior of an autonomous vehicle and to determine appropriate thresholds for finding likely failure events. To this end, we introduce a three-step methodology which i) uses offline state-action pairs of any autonomous vehicle under test, ii) builds an abstract behavior representation using our reward modeling technique to analyze states with uncertain driving decisions, and iii) uses a disturbance model to apply minimal perturbation attacks in states where the driving decisions are less confident. Our reward modeling technique creates a behavior representation that highlights regions of likely uncertain behavior even when the standard autonomous vehicle performs well. We perform our experiments in a high-fidelity urban driving environment using three different driving scenarios containing single- and multi-agent interactions. Our experiments show increases of 35%, 23%, 48%, and 50% in the occurrences of vehicle collision, road object collision, pedestrian collision, and offroad steering events, respectively, by the autonomous vehicle under test, a significant rise in failure events. We compare ReMAV with two baselines and show that it is significantly more effective at generating failure events on all evaluation metrics.
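The three steps can be illustrated with a minimal Python sketch. It is not the authors' implementation: the state layout (distance to the nearest obstacle, ego speed), the `toy_reward` function, the quantile-based threshold, and the additive `perturb` disturbance are all hypothetical stand-ins chosen only to show the flow from offline state-action pairs, to a reward-based behavior score, to selecting low-confidence states for a small perturbation.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

State = Tuple[float, ...]   # hypothetical layout: (distance_to_obstacle_m, speed_mps)
Action = Tuple[float, ...]  # hypothetical layout: (steer, throttle)


@dataclass
class Step:
    """One offline state-action pair recorded from the vehicle under test."""
    state: State
    action: Action


def toy_reward(state: State, action: Action) -> float:
    """Hypothetical stand-in for a learned reward model.

    Scores a step higher when the obstacle is far away relative to speed.
    """
    distance, speed = state
    return distance - 0.5 * speed


def select_attack_states(
    trajectory: Sequence[Step],
    reward_fn: Callable[[State, Action], float],
    quantile: float = 0.2,
) -> List[int]:
    """Return indices of steps whose reward falls in the lowest `quantile`.

    The quantile threshold is an assumption made for this sketch; it plays
    the role of the threshold derived from the offline behavior analysis.
    """
    rewards = [reward_fn(s.state, s.action) for s in trajectory]
    if not rewards:
        return []
    ranked = sorted(rewards)
    k = max(1, int(len(ranked) * quantile))
    threshold = ranked[k - 1]
    return [i for i, r in enumerate(rewards) if r <= threshold]


def perturb(state: State, epsilon: float = 0.05) -> State:
    """Hypothetical minimal disturbance: shift every state feature by epsilon."""
    return tuple(x + epsilon for x in state)


# Offline trajectory with made-up values, for illustration only.
trajectory = [
    Step((30.0, 10.0), (0.0, 0.5)),
    Step((8.0, 12.0), (0.1, 0.6)),
    Step((20.0, 10.0), (0.0, 0.5)),
    Step((5.0, 14.0), (-0.1, 0.7)),
]
attack_indices = select_attack_states(trajectory, toy_reward, quantile=0.5)
attacked_states = [perturb(trajectory[i].state, epsilon=0.5) for i in attack_indices]
```

Restricting perturbations to the low-reward steps is what narrows the search space: the remaining states are left untouched, so no adversary has to be trained across the whole environment.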

