The lack of transparency of Deep Neural Networks remains a limitation that severely undermines their reliability and use in high-stakes applications. A promising approach to overcoming this limitation is offered by Prototype-Based Self-Explainable Neural Networks (PSENNs), whose predictions rely on the similarity between the input at hand and a set of prototypical representations of the output classes, therefore offering a deep yet transparent-by-design architecture. So far, such models have been designed using point estimates for the prototypes, which remain fixed after the learning phase of the model. In this paper, we introduce a probabilistic reformulation of PSENNs, called Prob-PSENN, which replaces the point estimates for the prototypes with probability distributions over their values. This not only provides a more flexible framework for end-to-end learning of the prototypes, but also captures the explanatory uncertainty of the model, a feature missing in previous approaches. In addition, since the prototypes determine both the explanation and the prediction, Prob-PSENNs allow us to detect when the model is making uninformed or uncertain predictions, and to obtain valid explanations for them. Our experiments demonstrate that Prob-PSENNs provide more meaningful and robust explanations than their non-probabilistic counterparts, thereby enhancing both the explainability and the reliability of the models.
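To make the mechanism concrete, the following is a minimal PyTorch sketch of one way probabilistic prototypes could be realized, assuming a diagonal-Gaussian distribution per prototype and Monte Carlo averaging at prediction time; the names `ProbPrototypeLayer` and `predict_with_uncertainty`, as well as the `encoder` and `classifier` modules, are illustrative assumptions rather than the paper's actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ProbPrototypeLayer(nn.Module):
    """Illustrative sketch (not the authors' code): each prototype is a
    diagonal Gaussian in latent space, with learnable mean and log-variance."""
    def __init__(self, num_prototypes: int, latent_dim: int):
        super().__init__()
        self.mu = nn.Parameter(torch.randn(num_prototypes, latent_dim))
        self.log_var = nn.Parameter(torch.zeros(num_prototypes, latent_dim))

    def sample(self) -> torch.Tensor:
        # Reparameterization trick: draw one set of prototypes per call.
        eps = torch.randn_like(self.mu)
        return self.mu + eps * torch.exp(0.5 * self.log_var)

def predict_with_uncertainty(encoder, proto_layer, classifier, x, n_samples=20):
    """Monte Carlo estimate of the predictive distribution; disagreement
    across sampled prototype sets serves as a proxy for the model's
    explanatory uncertainty."""
    z = encoder(x)                              # (batch, latent_dim)
    probs = []
    for _ in range(n_samples):
        protos = proto_layer.sample()           # (num_prototypes, latent_dim)
        # Prediction depends on similarity to the sampled prototypes,
        # here taken as negative squared distance in latent space.
        dists = torch.cdist(z, protos) ** 2     # (batch, num_prototypes)
        probs.append(F.softmax(classifier(-dists), dim=-1))
    probs = torch.stack(probs)                  # (n_samples, batch, classes)
    return probs.mean(dim=0), probs.std(dim=0)  # prediction, uncertainty
```

In this sketch, `classifier` could be as simple as `nn.Linear(num_prototypes, num_classes)`; a high standard deviation across prototype samples would flag an uninformed or uncertain prediction, in the spirit of what the paper describes.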