Generating safety-critical scenarios is essential for testing and verifying the safety of autonomous vehicles. Traditional optimization techniques suffer from the curse of dimensionality and restrict the search to fixed parameter spaces. To address these challenges, we propose a deep reinforcement learning approach that generates scenarios by sequential editing, such as adding new agents or modifying the trajectories of existing agents. Our framework employs a reward function consisting of both risk and plausibility objectives. The plausibility objective leverages generative models, such as a variational autoencoder, to learn the likelihood of the generated parameters from the training datasets; it penalizes the generation of unlikely scenarios. Our approach overcomes the dimensionality challenge and explores a wide range of safety-critical scenarios. Our evaluation demonstrates that the proposed method generates safety-critical scenarios of higher quality than previous approaches.
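
As a rough illustration of how a reward can combine a risk objective with a plausibility objective, consider the minimal sketch below. It is not the implementation described in this work. An independent Gaussian fitted to the training parameters stands in for the learned generative model (the abstract names a variational autoencoder as one option), and the names `fit_gaussian`, `log_likelihood`, `risk_score`, `reward`, the distance-based risk proxy, and the `weight` and `scale` values are all hypothetical choices made for illustration.

```python
import math


def fit_gaussian(samples):
    """Fit one independent Gaussian per scenario parameter.

    Assumption: this simple density is a stand-in for a learned
    generative model such as a variational autoencoder.
    """
    n = len(samples)
    dims = len(samples[0])
    means = [sum(s[d] for s in samples) / n for d in range(dims)]
    variances = [
        # Floor the variance so the log-density stays finite.
        max(sum((s[d] - means[d]) ** 2 for s in samples) / n, 1e-6)
        for d in range(dims)
    ]
    return means, variances


def log_likelihood(params, model):
    """Log-density of the scenario parameters under the fitted model."""
    means, variances = model
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (p - m) ** 2 / v)
        for p, m, v in zip(params, means, variances)
    )


def risk_score(min_distance, scale=5.0):
    """Hypothetical risk proxy: equals 1 at zero distance, decays with distance."""
    return math.exp(-min_distance / scale)


def reward(params, min_distance, model, weight=0.1):
    """Risk term plus a weighted log-likelihood term.

    Unlikely parameters have a very negative log-likelihood, so they
    lower the reward, which acts as the plausibility penalty.
    """
    return risk_score(min_distance) + weight * log_likelihood(params, model)
```

For example, with a model fitted to a few made-up (speed, lateral offset) pairs, a scenario whose parameters lie far from the training data receives a lower reward than one near the data at the same minimum distance. A reinforcement learning agent trained on such a signal is pushed toward edits that are risky yet still likely under the data.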