Glaucoma is a major cause of irreversible blindness, and its diagnosis is highly subjective. This inherent uncertainty, combined with the overconfidence of models optimized solely for accuracy, can lead to serious errors such as overdiagnosis or missed diagnoses of critical disease. To earn clinical trust, model calibration is essential for reliable predictions, yet research in this area remains limited. Existing calibration studies have overlooked glaucoma's systemic associations and high diagnostic subjectivity. To overcome these limitations, we propose V-ViT (Voting-based ViT), a framework that enhances calibration by integrating a patient's binocular information and metadata. Furthermore, to mitigate diagnostic subjectivity, V-ViT uses an iterative dropout-based Voting System to maximize calibration performance. The proposed framework achieved state-of-the-art performance across all metrics, including the primary calibration metrics. Our results demonstrate that V-ViT effectively resolves the problem of overconfident predictions in glaucoma diagnosis, providing highly reliable predictions for clinical use. Our source code is available at https://github.com/starforTJ/V-ViT.
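To make the two ideas in the abstract concrete, the following is a minimal sketch, not the V-ViT implementation. It assumes that calibration is measured with the expected calibration error (ECE), a commonly used metric that the abstract does not name explicitly, and that the voting step averages the probabilities of several dropout-perturbed forward passes, which is one plausible reading of "iterative dropout-based voting". The function names `expected_calibration_error`, `dropout_vote`, and `toy_predict` are hypothetical and chosen for illustration only.

```python
import random


def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE: sample-weighted mean gap between confidence and accuracy per bin.

    confidences: predicted probability of the chosen class, each in [0, 1].
    correct: booleans, True where the prediction matched the label.
    """
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # Equal-width bins; a confidence of exactly 1.0 goes in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / n) * abs(avg_conf - accuracy)
    return ece


def dropout_vote(stochastic_predict, x, n_passes=20):
    """Average the probabilities from repeated stochastic forward passes.

    stochastic_predict is assumed to be a model call with dropout left
    active, so each call returns a slightly different probability.
    """
    probs = [stochastic_predict(x) for _ in range(n_passes)]
    return sum(probs) / len(probs)


if __name__ == "__main__":
    rng = random.Random(0)

    # Toy stand-in for a dropout-enabled model: a noisy probability near 0.8.
    def toy_predict(_x):
        return min(1.0, max(0.0, 0.8 + rng.uniform(-0.15, 0.15)))

    print("voted probability:", dropout_vote(toy_predict, x=None))
    print("ECE:", expected_calibration_error(
        [0.9, 0.9, 0.9, 0.9], [True, True, True, False]))
```

An overconfident model shows up here as a large gap between average confidence and accuracy within a bin. Averaging several stochastic passes tends to pull extreme probabilities toward the model's typical output, which is why voting schemes of this kind are often paired with calibration metrics.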