As deep neural networks are increasingly deployed in high-stakes domains, their black-box nature makes uncertainty quantification challenging. We investigate the effects of presenting conformal prediction sets, a distribution-free class of methods for generating prediction sets with specified coverage, to express uncertainty in AI-advised decision-making. Through a large online experiment, we compare the utility of conformal prediction sets to displays of Top-$1$ and Top-$k$ predictions for AI-advised image labeling. In a pre-registered analysis, we find that the utility of prediction sets for accuracy varies with the difficulty of the task: while they result in accuracy on par with or lower than Top-$1$ and Top-$k$ displays for easy images, prediction sets excel at assisting humans in labeling out-of-distribution (OOD) images, especially when the set size is small. Our results empirically pinpoint practical challenges of conformal prediction sets and offer implications for how to incorporate them in real-world decision-making.
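To make the contrast between the three displays concrete, the following is a minimal sketch of how a conformal prediction set might be built next to a Top-$k$ list. The abstract does not say which conformal procedure the study used, so this assumes the common split conformal recipe with the nonconformity score $1 - \hat{p}(\text{true class})$; the function names `calibrate_threshold`, `prediction_set`, and `top_k` are hypothetical and chosen only for illustration.

```python
import numpy as np


def calibrate_threshold(cal_probs, cal_labels, alpha):
    """Return the score threshold for a target coverage of 1 - alpha.

    Assumption: split conformal prediction with score = 1 - p(true class),
    computed on a held-out calibration set (not the paper's stated method).
    """
    cal_probs = np.asarray(cal_probs, dtype=float)
    cal_labels = np.asarray(cal_labels, dtype=int)
    n = len(cal_labels)
    # Nonconformity score of each calibration example, sorted ascending.
    scores = np.sort(1.0 - cal_probs[np.arange(n), cal_labels])
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)).
    rank = int(np.ceil((n + 1) * (1 - alpha)))
    # If the rank exceeds n, no finite threshold suffices: include every class.
    return scores[rank - 1] if rank <= n else np.inf


def prediction_set(probs, qhat):
    """Indices of all classes whose score 1 - p does not exceed the threshold."""
    probs = np.asarray(probs, dtype=float)
    return np.flatnonzero(1.0 - probs <= qhat)


def top_k(probs, k):
    """Indices of the k most probable classes, most probable first."""
    probs = np.asarray(probs, dtype=float)
    return np.argsort(probs)[::-1][:k]
```

Under these assumptions, a Top-$k$ display always shows exactly $k$ labels, whereas the conformal set grows or shrinks with the model's confidence: a confident prediction yields a small set, and an ambiguous one yields a larger set. This is the set-size behavior that the abstract's findings refer to.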