Predictions made by deep learning models are sensitive to data perturbations, adversarial attacks, and out-of-distribution inputs. To build a trustworthy AI system, it is therefore critical to quantify prediction uncertainty accurately. While current efforts focus on improving the accuracy and efficiency of uncertainty quantification, there remains a need to identify the sources of uncertainty and take action to mitigate their effects on predictions. We therefore propose explainable and actionable Bayesian deep learning methods that not only perform accurate uncertainty quantification but also explain the uncertainties, identify their sources, and suggest strategies to mitigate their impact. Specifically, we introduce a gradient-based uncertainty attribution method, UA-Backprop, that identifies the regions of the input that contribute most to the prediction uncertainty. Compared to existing methods, UA-Backprop has competitive accuracy, relaxed assumptions, and high efficiency. Moreover, we propose an uncertainty mitigation strategy that uses the attribution results as attention to further improve model performance. Both qualitative and quantitative evaluations are conducted to demonstrate the effectiveness of the proposed methods.
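To make the idea of gradient-based uncertainty attribution concrete, the following is a minimal sketch of a plain gradient saliency map on predictive entropy. It is not UA-Backprop, whose details are not given in this abstract. It assumes the Bayesian posterior is approximated by a small set of sampled linear-softmax classifiers, and all function names (`sample_models`, `predictive_entropy`, `entropy_gradient`, `attribution`) are hypothetical and chosen for illustration only.

```python
import math
import random


def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def predict(weights, x):
    """Class probabilities of one linear-softmax model (weights: classes x features)."""
    return softmax([sum(w * xi for w, xi in zip(row, x)) for row in weights])


def mean_prediction(models, x):
    """Per-model predictions and their average over the sampled models."""
    probs = [predict(w, x) for w in models]
    mean = [sum(p[k] for p in probs) / len(probs) for k in range(len(probs[0]))]
    return probs, mean


def predictive_entropy(models, x):
    """Entropy of the averaged prediction, used here as the uncertainty score."""
    _, mean = mean_prediction(models, x)
    return -sum(p * math.log(p) for p in mean if p > 0)


def entropy_gradient(models, x):
    """Analytic gradient of the predictive entropy with respect to the input x."""
    probs, mean = mean_prediction(models, x)
    # dH/dmean_k = -(log mean_k + 1)
    g = [-(math.log(m) + 1.0) for m in mean]
    grad = [0.0] * len(x)
    for w, p in zip(models, probs):
        pg = sum(pk * gk for pk, gk in zip(p, g))
        # Softmax Jacobian applied to g: dH/dz_k = p_k * (g_k - p.g)
        dz = [pk * (gk - pg) for pk, gk in zip(p, g)]
        for i in range(len(x)):
            # Chain rule through the linear layer, averaged over models
            grad[i] += sum(dz[k] * w[k][i] for k in range(len(p))) / len(models)
    return grad


def attribution(models, x):
    """Saliency-style attribution: absolute gradient per input feature."""
    return [abs(v) for v in entropy_gradient(models, x)]


def sample_models(n_models=8, n_classes=3, n_features=4, seed=0):
    """Hypothetical stand-in for posterior samples; the last feature is ignored by all models."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        w = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n_classes)]
        for row in w:
            row[-1] = 0.0
        models.append(w)
    return models
```

Calling `attribution(sample_models(), [0.5, -1.0, 2.0, 3.0])` returns one non-negative score per input feature; the last score is zero because no sampled model uses that feature. In an image setting the same idea would produce a per-pixel map, which could then be normalized and used as an attention mask in the spirit of the mitigation strategy described above.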