Deep neural networks (DNNs) have become integral to a wide range of scientific and practical applications due to their flexibility and strong predictive performance. Despite their accuracy, DNNs frequently exhibit poor calibration, often assigning overly confident probabilities to incorrect predictions. This limitation underscores the growing need for integrated mechanisms that provide reliable uncertainty estimation. In this article, we compare two prominent approaches to uncertainty quantification: a Bayesian approximation via Monte Carlo Dropout and the nonparametric Conformal Prediction framework. Both methods are assessed using two convolutional neural network architectures, H-CNN VGG16 and GoogLeNet, trained on the Fashion-MNIST dataset. The empirical results show that although H-CNN VGG16 attains higher predictive accuracy, it tends to exhibit pronounced overconfidence, whereas GoogLeNet yields better-calibrated uncertainty estimates. Conformal Prediction additionally demonstrates consistent validity by producing prediction sets with statistical coverage guarantees, highlighting its practical value in high-stakes decision-making contexts. Overall, the findings emphasize the importance of evaluating model performance beyond accuracy alone and contribute to the development of more reliable and trustworthy deep learning systems.
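To make the Conformal Prediction idea concrete, the following is a minimal sketch of split conformal prediction for classification. The calibration data here are simulated, and all variable names are illustrative assumptions, not taken from the paper; in practice, `cal_probs` would be softmax outputs of a trained network (e.g. VGG16 or GoogLeNet) on a held-out calibration split.

```python
import numpy as np

rng = np.random.default_rng(0)
n_cal, n_classes = 500, 10

# Simulated softmax probabilities and labels for a calibration set
# (stand-ins for real model outputs on held-out data).
logits = rng.normal(size=(n_cal, n_classes))
cal_probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
cal_labels = rng.integers(0, n_classes, size=n_cal)

# Nonconformity score: 1 minus the probability assigned to the true class.
scores = 1.0 - cal_probs[np.arange(n_cal), cal_labels]

# Conformal quantile for target coverage 1 - alpha = 90%.
alpha = 0.1
q_level = np.ceil((n_cal + 1) * (1 - alpha)) / n_cal
qhat = np.quantile(scores, q_level, method="higher")

# Prediction set for a new example: every class whose score is <= qhat.
test_probs = cal_probs[0]  # stand-in for a fresh softmax vector
prediction_set = np.where(1.0 - test_probs <= qhat)[0]
```

Under exchangeability of calibration and test points, the resulting sets contain the true label with probability at least 1 - alpha, which is the "consistent validity" property the abstract refers to.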