Generating safety-critical scenarios is essential for testing and verifying the safety of autonomous vehicles. Traditional optimization techniques suffer from the curse of dimensionality and restrict the search to fixed parameter spaces. To address these challenges, we propose a deep reinforcement learning approach that generates scenarios through sequential editing, such as adding new agents or modifying the trajectories of existing agents. Our framework employs a reward function that combines risk and plausibility objectives. The plausibility objective uses a generative model, such as a variational autoencoder, to learn the likelihood of the generated parameters from the training datasets, and it penalizes the generation of unlikely scenarios. Our approach overcomes the dimensionality challenge and explores a wide range of safety-critical scenarios. Our evaluation demonstrates that the proposed method generates higher-quality safety-critical scenarios than previous approaches.
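The abstract does not give the exact form of the reward, so the following is only a minimal sketch of how a risk objective and a plausibility objective might be combined. It assumes a weighted difference, a scalar `risk_score` from some risk metric, and a `log_likelihood` standing in for the output of a generative model such as a variational autoencoder. The names `RewardWeights`, `plausibility_penalty`, `scenario_reward`, the threshold, and the weights are all hypothetical and not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardWeights:
    # Hypothetical weights; the paper's actual values are not stated here.
    risk: float = 1.0
    plausibility: float = 0.5


def plausibility_penalty(log_likelihood: float, threshold: float) -> float:
    """Non-negative penalty that grows as a scenario falls below the
    likelihood threshold (assumed form, for illustration only)."""
    return max(0.0, threshold - log_likelihood)


def scenario_reward(
    risk_score: float,
    log_likelihood: float,
    threshold: float = -10.0,
    weights: RewardWeights = RewardWeights(),
) -> float:
    """Reward an edit for raising risk, minus a penalty for unlikely scenarios."""
    penalty = plausibility_penalty(log_likelihood, threshold)
    return weights.risk * risk_score - weights.plausibility * penalty


if __name__ == "__main__":
    # A risky scenario that the generative model considers likely.
    print(scenario_reward(risk_score=0.8, log_likelihood=-5.0))
    # An equally risky scenario that the generative model considers unlikely.
    print(scenario_reward(risk_score=0.8, log_likelihood=-14.0))
```

Under these assumptions, two edits with the same risk score receive different rewards once one of them drops below the likelihood threshold, which is the behaviour the plausibility objective is described as encouraging.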