Uncertainty quantification (UQ) is an essential tool for applying deep neural networks (DNNs) to real-world tasks, as it attaches a degree of confidence to DNN outputs. Despite its benefits, UQ is often left out of the standard DNN workflow because of the additional technical knowledge required to apply and evaluate existing UQ procedures. Hence, there is a need for a comprehensive toolbox that allows users to integrate UQ into their modelling workflow without significant overhead. We introduce \texttt{Lightning UQ Box}: a unified interface for applying and evaluating various approaches to UQ. In this paper, we provide a theoretical and quantitative comparison of the wide range of state-of-the-art UQ methods implemented in our toolbox. We focus on two challenging vision tasks: (i) estimating tropical cyclone wind speeds from infrared satellite imagery and (ii) estimating the power output of solar panels from RGB images of the sky. By highlighting the differences between methods, our results demonstrate the need for a broad and approachable experimental framework for UQ that can be used for benchmarking UQ methods. The toolbox, example implementations, and further information are available at: https://github.com/lightning-uq-box/lightning-uq-box