Adaptive Testing Environment Generation for Connected and Automated Vehicles with Dense Reinforcement Learning

The assessment of safety performance plays a pivotal role in the development and deployment of connected and automated vehicles (CAVs). A common approach involves designing testing scenarios based on prior knowledge of CAVs (e.g., surrogate models), conducting tests in these scenarios, and subsequently evaluating CAVs' safety performances. However, substantial differences between CAVs and the prior knowledge can significantly diminish the evaluation efficiency. In response to this issue, existing studies predominantly concentrate on the adaptive design of testing scenarios during the CAV testing process. Yet, these methods have limitations in their applicability to high-dimensional scenarios. To overcome this challenge, we develop an adaptive testing environment that bolsters evaluation robustness by incorporating multiple surrogate models and optimizing the combination coefficients of these surrogate models to enhance evaluation efficiency. We formulate the optimization problem as a regression task utilizing quadratic programming. To efficiently obtain the regression target via reinforcement learning, we propose the dense reinforcement learning method and devise a new adaptive policy with high sample efficiency. Essentially, our approach centers on learning the values of critical scenes displaying substantial surrogate-to-real gaps. The effectiveness of our method is validated in high-dimensional overtaking scenarios, demonstrating that our approach achieves notable evaluation efficiency.

翻译：安全性能评估在网联自动驾驶车辆（CAVs）的开发与部署中发挥着关键作用。常规方法基于CAVs的先验知识（如代理模型）设计测试场景，在这些场景中执行测试，进而评估CAVs的安全性能。然而，CAVs与先验知识之间的显著差异会大幅降低评估效率。针对这一问题，现有研究主要集中于CAV测试过程中测试场景的自适应设计。但这些方法在高维场景中存在适用性局限。为克服这一挑战，我们开发了一种自适应测试环境，通过集成多个代理模型并优化其组合系数来增强评估鲁棒性与效率。我们将该优化问题构建为基于二次规划的回归任务。为通过强化学习高效获取回归目标，我们提出了密集强化学习方法，并设计了一种具有高样本效率的新型自适应策略。本质上，该方法聚焦于学习那些代理模型与真实环境存在显著差异的关键场景的价值。该方法在超车等高维场景中的有效性得到了验证，结果表明我们的方法实现了显著的评估效率。

相关内容

Automator

关注 5

Automator是苹果公司为他们的Mac OS X系统开发的一款软件。 只要通过点击拖拽鼠标等操作就可以将一系列动作组合成一个工作流，从而帮助你自动的（可重复的）完成一些复杂的工作。Automator还能横跨很多不同种类的程序，包括：查找器、Safari网络浏览器、iCal、地址簿或者其他的一些程序。它还能和一些第三方的程序一起工作，如微软的Office、Adobe公司的Photoshop或者Pixelmator等。

【NeurIPS2021】用于文本图表示学习的 GNN 嵌套 Transformer 模型：GraphFormers

专知会员服务

46+阅读 · 2021年11月24日

Linux导论，Introduction to Linux，96页ppt

专知会员服务

82+阅读 · 2020年7月26日

【WSDM2020】超越统计关系：将知识关系整合到多标签音乐风格分类的风格关联中（附pdf）

专知会员服务

18+阅读 · 2019年11月23日

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

34+阅读 · 2019年10月18日