Considerable effort is currently being invested in safeguarding autonomous driving systems, which rely heavily on deep neural networks for computer vision. We investigate the coupling of different neural network calibration measures, with a special focus on the Area Under the Sparsification Error curve (AUSE) metric. We elaborate on the well-known inconsistency in determining optimal calibration using the Expected Calibration Error (ECE), and we demonstrate similar issues for the AUSE, the Uncertainty Calibration Score (UCS), and the Uncertainty Calibration Error (UCE). We conclude that current methodologies leave a degree of freedom that prevents a unique model calibration for the homologation of safety-critical functionalities. Furthermore, we propose the AUSE as an indirect measure of the residual uncertainty, which is irreducible for a fixed network architecture and is driven both by the stochasticity of the underlying data-generation process (aleatoric contribution) and by the limitations of the hypothesis space (epistemic contribution).
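To make the two central metrics concrete, the following is a minimal sketch (not the paper's implementation) of how a binned ECE and the AUSE are commonly computed on toy data. All names and the synthetic data are illustrative; the printed ECE values also show the degree of freedom mentioned above, since the same predictions yield different ECE values under different bin counts.

```python
# Hedged sketch: binned ECE and AUSE on synthetic data; the binning
# choice for ECE is a free parameter, illustrating the calibration
# ambiguity discussed in the abstract.
import numpy as np

rng = np.random.default_rng(0)

# Toy binary classifier: predicted confidences and hit/miss labels.
conf = rng.uniform(0.5, 1.0, size=2000)
correct = (rng.uniform(size=2000) < conf).astype(float)

def ece(conf, correct, n_bins):
    """Standard binned ECE: weighted |accuracy - confidence| per bin."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    total = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (conf > lo) & (conf <= hi)
        if mask.any():
            total += mask.mean() * abs(correct[mask].mean() - conf[mask].mean())
    return total

# Same predictions, different bin counts -> different ECE values.
print([round(ece(conf, correct, b), 4) for b in (5, 10, 20)])

def ause(uncertainty, error):
    """Sparsification error: remove the most uncertain samples first and
    track the mean error of the remainder; compare against the oracle
    that removes the largest true errors first. AUSE is the normalized
    area between the two curves."""
    n = len(error)
    by_unc = np.argsort(-uncertainty)   # most uncertain first
    by_err = np.argsort(-error)         # oracle: largest error first
    curve_unc = np.array([error[by_unc[k:]].mean() for k in range(n - 1)])
    curve_orc = np.array([error[by_err[k:]].mean() for k in range(n - 1)])
    base = curve_unc[0]                 # normalize by full-set error
    return np.trapz(curve_unc - curve_orc, dx=1.0 / n) / base

# Per-sample errors with an imperfectly correlated uncertainty estimate.
err = np.abs(rng.normal(0.0, 1.0, size=2000))
unc = err + rng.normal(0.0, 0.5, size=2000)
print(round(ause(unc, err), 4))
```

A perfectly ranked uncertainty estimate would reproduce the oracle curve and give AUSE = 0; the residual gap is what the abstract interprets as an indirect measure of irreducible uncertainty.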