Quantitative tools are increasingly appealing for decision support in healthcare, driven by the growing capabilities of advanced AI systems. However, understanding the predictive uncertainties surrounding a tool's output is crucial for decision-makers to ensure reliable and transparent decisions. In this paper, we present a case study on pulmonary nodule detection for lung cancer screening, enhancing an advanced detection model with an uncertainty quantification technique called conformal risk control (CRC). We demonstrate that prediction sets with conformal guarantees are attractive measures of predictive uncertainty in the safety-critical healthcare domain, allowing end-users to achieve arbitrary validity by trading off false positives and providing formal statistical guarantees on model performance. Among ground-truth nodules annotated by at least three radiologists, our model achieves a sensitivity that is competitive with that generally achieved by individual radiologists, at the cost of a slight increase in false positives. Furthermore, we illustrate the risks of using off-the-shelf prediction models when faced with ontological uncertainty, such as when radiologists disagree on what constitutes the ground truth for pulmonary nodules.
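The conformal guarantee referred to above comes from the CRC calibration step, which selects a detection threshold so that a chosen risk (here, the expected per-case false-negative proportion) stays below a user-specified level. The sketch below is a rough, hypothetical illustration only, not the pipeline evaluated in this paper; the function name, the threshold grid, and the assumption that each ground-truth nodule is summarized by the highest confidence of any candidate detection hitting it are all choices made for this example.

```python
import numpy as np

def crc_threshold(nodule_scores, alpha, n_grid=1000):
    """Illustrative conformal risk control calibration for a detection threshold.

    nodule_scores : list of 1-D arrays; nodule_scores[i] holds, for each ground-truth
                    nodule in calibration case i, the highest confidence of any candidate
                    detection that hits it (0.0 if no candidate hits it).
    alpha         : target bound on the expected per-case false-negative proportion.

    Returns the largest threshold lambda whose CRC-adjusted empirical risk,
    n/(n+1) * R_hat(lambda) + 1/(n+1), stays below alpha; at prediction time,
    every candidate scoring >= lambda is kept.
    """
    n = len(nodule_scores)
    lambdas = np.linspace(0.0, 1.0, n_grid)
    best = 0.0  # lambda = 0 keeps every candidate, so the miss rate is 0 and the bound holds
    for lam in lambdas:
        # Per-case loss: fraction of true nodules whose best matching score falls below lam.
        losses = np.array([np.mean(s < lam) if len(s) else 0.0 for s in nodule_scores])
        # CRC adjustment with loss bound B = 1, since losses lie in [0, 1].
        adjusted_risk = (n / (n + 1)) * losses.mean() + 1.0 / (n + 1)
        if adjusted_risk <= alpha:
            best = lam  # the loss is monotone in lam, so keep the largest feasible threshold
        else:
            break
    return best

# Hypothetical usage: three calibration cases, targeting a false-negative rate below 0.1.
scores = [np.array([0.9, 0.4]), np.array([0.7]), np.array([0.85, 0.6, 0.2])]
lam_hat = crc_threshold(scores, alpha=0.1)
```

Lowering alpha in this sketch forces a lower threshold, so more candidates (and hence more false positives) are retained in exchange for the stronger sensitivity guarantee, which is the trade-off described above.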