Perception components in autonomous systems are often developed and optimized independently of downstream decision-making and control components, relying on established performance metrics like accuracy, precision, and recall. Traditional loss functions, such as cross-entropy loss and negative log-likelihood, focus on reducing misclassification errors but fail to consider their impact on system-level safety, overlooking the varying severities of system-level failures caused by these errors. To address this limitation, we propose a novel training paradigm that augments the perception component with an understanding of system-level safety objectives. Central to our approach is the translation of system-level safety requirements, formally specified using the rulebook formalism, into safety scores. These scores are then incorporated into the reward function of a reinforcement learning framework for fine-tuning perception models with system-level safety objectives. Simulation results demonstrate that models trained with this approach outperform baseline perception models in terms of system-level safety.
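The core mechanism described above — mapping rulebook violations to a safety score and folding that score into an RL reward — can be sketched as follows. This is purely illustrative: the function names, the linear weighting scheme, and the additive reward composition are all assumptions for exposition, not the paper's actual formulation.

```python
# Illustrative sketch (all names and the scoring scheme are hypothetical):
# turning rulebook-style safety violations into a score, then into an RL reward.

def safety_score(violations, rule_weights):
    """Map rule violations to a score in [0, 1] (hypothetical linear scheme).

    `violations`:   rule name -> violation magnitude in [0, 1]
    `rule_weights`: rule name -> weight; higher-priority rules weigh more
    """
    penalty = sum(rule_weights[rule] * v for rule, v in violations.items())
    total = sum(rule_weights.values())
    return max(0.0, 1.0 - penalty / total)

def reward(violations, rule_weights, task_reward, safety_coef=1.0):
    """Hypothetical composite reward: task performance plus weighted safety."""
    return task_reward + safety_coef * safety_score(violations, rule_weights)

# Example: a collision-rule violation should dominate a minor lane deviation,
# reflecting the differing severities of system-level failures.
weights = {"no_collision": 10.0, "stay_in_lane": 1.0}
r_minor = reward({"no_collision": 0.0, "stay_in_lane": 0.2}, weights, task_reward=1.0)
r_crash = reward({"no_collision": 1.0, "stay_in_lane": 0.0}, weights, task_reward=1.0)
```

In this sketch the fine-tuning loop would treat `reward` as the scalar signal for a standard policy-gradient update of the perception model; the rulebook's rule priorities enter only through the relative weights.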