In 2017, Hughes claimed an equivalence between Tjur's $R^2$ coefficient of discrimination and the Youden index for assessing diagnostic test performance on $2\times 2$ contingency tables. We prove an impossibility result when averaging over binary outcomes (0s and 1s) under any continuous real-valued scoring rule. Our findings clarify the limitations of such a possible equivalence and highlight the distinct roles these metrics play in diagnostic test assessment.
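As a rough illustration of the two quantities being compared, the sketch below computes the Youden index and Tjur's coefficient of discrimination from the cell counts of a $2\times 2$ table, using their standard textbook definitions. The function names, the example counts, and in particular the choice of score (each case is scored by the observed proportion of positive outcomes in its test-result group) are assumptions made for this example only; they are not the scoring rule analysed in the paper.

```python
def youden_index(tp, fn, fp, tn):
    """Youden index J = sensitivity + specificity - 1."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1.0


def tjur_r2(scores, outcomes):
    """Tjur's coefficient of discrimination: mean score among
    outcome-1 cases minus mean score among outcome-0 cases."""
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    return sum(pos) / len(pos) - sum(neg) / len(neg)


def table_to_scores(tp, fn, fp, tn):
    """Expand a 2x2 table into per-case (score, outcome) lists.

    Assumption for illustration: a case's score is the observed
    proportion of outcome 1 within its test-result group.
    """
    p_test_pos = tp / (tp + fp)  # proportion with outcome 1 among test-positives
    p_test_neg = fn / (fn + tn)  # proportion with outcome 1 among test-negatives
    scores = [p_test_pos] * tp + [p_test_neg] * fn + [p_test_pos] * fp + [p_test_neg] * tn
    outcomes = [1] * tp + [1] * fn + [0] * fp + [0] * tn
    return scores, outcomes


if __name__ == "__main__":
    # Hypothetical counts, chosen only to make the arithmetic easy to follow.
    tp, fn, fp, tn = 40, 10, 20, 30
    scores, outcomes = table_to_scores(tp, fn, fp, tn)
    print("Youden index:", youden_index(tp, fn, fp, tn))
    print("Tjur's R^2:  ", tjur_r2(scores, outcomes))
```

With these hypothetical counts and this scoring choice, the two printed values differ. The value of Tjur's coefficient depends on which score is assigned to each case, whereas the Youden index depends only on the cell counts.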