Considerable effort is currently invested in safeguarding autonomous driving systems, which rely heavily on deep neural networks for computer vision. We investigate the coupling of different neural network calibration measures, with a special focus on the Area Under the Sparsification Error curve (AUSE) metric. We elaborate on the well-known inconsistency in determining optimal calibration using the Expected Calibration Error (ECE), and we demonstrate similar issues for the AUSE, the Uncertainty Calibration Score (UCS), and the Uncertainty Calibration Error (UCE). We conclude that the current methodologies leave a degree of freedom, which prevents a unique model calibration for the homologation of safety-critical functionalities. Furthermore, we propose the AUSE as an indirect measure of the residual uncertainty, which is irreducible for a fixed network architecture and is driven by the stochasticity of the underlying data generation process (aleatoric contribution) as well as by the limitation of the hypothesis space (epistemic contribution).
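For concreteness, the two central metrics can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes a standard binned ECE and a fraction-based sparsification grid, and both the number of bins and the number of sparsification steps are free choices, which is precisely the kind of degree of freedom discussed above.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: weighted mean gap between confidence and accuracy per bin.

    The bin count (here 10) is a free parameter of the metric.
    """
    ece = 0.0
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            ece += mask.mean() * abs(correct[mask].mean() - confidences[mask].mean())
    return ece

def ause(uncertainties, errors, n_steps=50):
    """Area Under the Sparsification Error curve (AUSE).

    Remove the most-uncertain fraction of samples step by step and track the
    mean error of the remainder; subtract the oracle curve (samples removed
    by true error instead) and integrate the gap. The step count is, again,
    a free parameter.
    """
    n = len(errors)
    by_unc = errors[np.argsort(-uncertainties)]   # most uncertain first
    by_err = errors[np.argsort(-errors)]          # oracle: largest error first
    fracs = np.linspace(0.0, 0.99, n_steps)       # fraction of samples removed
    spars, oracle = [], []
    for f in fracs:
        k = int(f * n)
        spars.append(by_unc[k:].mean())
        oracle.append(by_err[k:].mean())
    gap = np.asarray(spars) - np.asarray(oracle)  # sparsification error curve
    # trapezoidal integration of the gap over the removal fractions
    return float(np.sum((gap[1:] + gap[:-1]) / 2.0 * np.diff(fracs)))
```

A perfectly calibrated confidence (bin accuracy equals bin confidence) yields an ECE of zero, and an uncertainty that ranks samples exactly like the true error yields an AUSE of zero; deviations from either raise the respective score.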