Fusing raw features from multiple sensors on an autonomous vehicle into a Bird's Eye View (BEV) representation is crucial for planning and control systems. There is growing interest in using deep learning models for BEV semantic segmentation. Anticipating segmentation errors and improving the explainability of deep neural networks (DNNs) are essential for autonomous driving, yet both remain under-studied. This paper introduces a benchmark for predictive uncertainty quantification in BEV segmentation. The benchmark assesses various approaches across three popular datasets using two representative backbones, and focuses on how effectively the predicted uncertainty identifies misclassified and out-of-distribution (OOD) pixels, as well as on calibration. Our empirical findings highlight the challenges in uncertainty quantification. The results show that evidential deep learning-based approaches are the most promising, as they efficiently quantify aleatoric and epistemic uncertainty. We propose the Uncertainty-Focal-Cross-Entropy (UFCE) loss, designed for highly imbalanced data, which consistently improves segmentation quality and calibration. Additionally, we introduce a vacuity-scaled regularization term that enhances the model's focus on high-uncertainty pixels, improving epistemic uncertainty quantification.
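For readers unfamiliar with the term, vacuity in evidential deep learning is commonly defined through a per-pixel Dirichlet distribution: with non-negative evidence e_k for each of K classes, alpha_k = e_k + 1, S = sum of alpha_k, and vacuity = K / S. The sketch below illustrates that standard formulation only. It is not the paper's implementation, and the function name `dirichlet_uncertainty`, the (K, H, W) array layout, and the toy evidence values are assumptions made for illustration. It does not reproduce the UFCE loss or the vacuity-scaled regularizer, whose exact forms are not given in this abstract.

```python
import numpy as np


def dirichlet_uncertainty(evidence):
    """Per-pixel expected class probabilities and vacuity.

    evidence: non-negative array of shape (K, H, W), one map per class.
    (Hypothetical helper; the layout is an assumption, not the paper's API.)
    """
    evidence = np.asarray(evidence, dtype=np.float64)
    num_classes = evidence.shape[0]
    alpha = evidence + 1.0            # Dirichlet concentration parameters
    strength = alpha.sum(axis=0)      # Dirichlet strength S, shape (H, W)
    expected_prob = alpha / strength  # mean of the Dirichlet, shape (K, H, W)
    vacuity = num_classes / strength  # in (0, 1]; equals 1 with zero evidence
    return expected_prob, vacuity


# Toy 1x2 BEV grid with two classes (made-up values).
evidence = np.zeros((2, 1, 2))
evidence[0, 0, 0] = 18.0  # pixel (0, 0): strong evidence for class 0
# pixel (0, 1) keeps zero evidence for both classes

expected_prob, vacuity = dirichlet_uncertainty(evidence)
print(vacuity)              # low for pixel (0, 0), maximal for pixel (0, 1)
print(expected_prob[:, 0, 0])
```

A pixel with no evidence for any class has vacuity 1, which is why vacuity is often used as an epistemic uncertainty score for flagging OOD pixels.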