Generating safety-critical scenarios is essential for testing and verifying the safety of autonomous vehicles. Traditional optimization techniques suffer from the curse of dimensionality and restrict the search to a fixed parameter space. To address these challenges, we propose a deep reinforcement learning approach that generates scenarios by sequential editing, such as adding new agents or modifying the trajectories of existing agents. Our framework employs a reward function that combines a risk objective and a plausibility objective. The plausibility objective leverages generative models, such as a variational autoencoder, to learn the likelihood of the generated parameters from the training datasets; it penalizes the generation of unlikely scenarios. Our approach overcomes the dimensionality challenge and explores a wide range of safety-critical scenarios. Our evaluation demonstrates that the proposed method generates higher-quality safety-critical scenarios than previous approaches.
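The combination of a risk objective and a plausibility objective can be illustrated with a minimal sketch. It is not the implementation described above: the names `Scenario`, `add_agent`, `risk`, `gaussian_log_likelihood`, and `reward`, the inverse-distance risk proxy, and the `weight` parameter are all assumptions made for illustration. In particular, a diagonal Gaussian log-likelihood stands in for the learned generative model (such as a variational autoencoder) so that the example stays self-contained.

```python
import math
from dataclasses import dataclass, field
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


@dataclass
class Scenario:
    """Hypothetical scenario: one ego trajectory plus other agents' trajectories."""
    ego: List[Point]
    agents: List[List[Point]] = field(default_factory=list)


def add_agent(scenario: Scenario, trajectory: List[Point]) -> Scenario:
    """One sequential edit: return a new scenario with an extra agent."""
    return Scenario(scenario.ego, scenario.agents + [trajectory])


def risk(scenario: Scenario) -> float:
    """Assumed risk proxy: grows as any agent gets closer to the ego
    vehicle at the same time step. Returns a value in (0, 1]."""
    if not scenario.agents:
        return 0.0
    d_min = min(
        math.dist(ego_pt, agent_pt)
        for trajectory in scenario.agents
        for ego_pt, agent_pt in zip(scenario.ego, trajectory)
    )
    return 1.0 / (1.0 + d_min)


def gaussian_log_likelihood(
    x: Sequence[float], mean: Sequence[float], std: Sequence[float]
) -> float:
    """Stand-in plausibility model: diagonal Gaussian log-likelihood.
    A learned generative model would replace this in practice."""
    return sum(
        -0.5 * ((xi - m) / s) ** 2 - math.log(s) - 0.5 * math.log(2 * math.pi)
        for xi, m, s in zip(x, mean, std)
    )


def reward(
    scenario: Scenario,
    edit_params: Sequence[float],
    mean: Sequence[float],
    std: Sequence[float],
    weight: float = 0.1,
) -> float:
    """Risk term plus a weighted plausibility term; unlikely edit
    parameters have a low log-likelihood and are therefore penalized."""
    return risk(scenario) + weight * gaussian_log_likelihood(edit_params, mean, std)
```

Under these assumptions, an edit that brings an agent close to the ego vehicle raises the risk term, while edit parameters far from the training distribution lower the plausibility term, so the agent is rewarded for scenarios that are both dangerous and realistic.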