As deep neural networks are increasingly deployed in high-stakes domains, their lack of interpretability makes uncertainty quantification challenging. We investigate the effects of presenting conformal prediction sets, a distribution-free uncertainty quantification method for generating valid confidence sets, to express uncertainty in AI-advised decision-making. Through a large online experiment, we compare the utility of conformal prediction sets to displays of Top-$1$ and Top-$k$ predictions for AI-advised image labeling. We find that the utility of prediction sets for accuracy varies with the difficulty of the task: while they result in accuracy on par with or lower than Top-$1$ and Top-$k$ displays for easy images, prediction sets excel at assisting humans in labeling out-of-distribution (OOD) images, especially when the set size is small. Our results empirically pinpoint the practical challenges of conformal prediction sets and offer implications for how to incorporate them into real-world decision-making.
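For readers unfamiliar with the three displays being compared, the following is a minimal sketch of how a conformal prediction set could be formed next to Top-$1$ and Top-$k$ lists. It assumes the common split-conformal recipe with the score "one minus the softmax probability of the true label"; the abstract does not state which conformal procedure the study used, and the function names `conformal_threshold`, `prediction_set`, and `top_k` are hypothetical.

```python
import math


def conformal_threshold(cal_probs, cal_labels, alpha=0.1):
    """Split-conformal threshold from held-out calibration data.

    Assumption: the nonconformity score is 1 - p(true label). This is one
    common choice, not necessarily the one used in the study.
    """
    scores = sorted(1.0 - p[y] for p, y in zip(cal_probs, cal_labels))
    n = len(scores)
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)).
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points for this alpha: fall back to a
        # threshold that admits every label (scores never exceed 1).
        return 1.0
    return scores[rank - 1]


def prediction_set(probs, qhat):
    """All labels whose score 1 - p does not exceed the threshold."""
    return [label for label, p in enumerate(probs) if 1.0 - p <= qhat]


def top_k(probs, k):
    """The k most probable labels, most probable first (k=1 gives Top-1)."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return order[:k]
```

Unlike a Top-$k$ list, whose length is fixed in advance, the set returned by `prediction_set` varies in size with the model's confidence on each input, which is the property behind the set-size effect reported above.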