Conformal triage converts predictive scores into one of three deployment actions: release a case, flag it for urgent attention, or defer it to human review. Under prevalence shift, however, the usual summaries, marginal coverage and human-review rate, can miss the safety-critical question of whether patients who truly experience the target event are released without review. To address this gap, we introduce a leakage-aware deployment audit for release-side conformal triage. The audit first assigns target subjects to three non-overlapping roles: prevalence correction, conformal calibration, and held-out release-safety evaluation. This separation lets the audit evaluate release directly: how many event-positive patients are cleared without review, whether the pilot has enough event labels for calibration, and how the trade-off between release safety and review burden shifts. Applying the audit to a retrospective non-small cell lung cancer (NSCLC) pilot shows why a lower review rate can be misleading: after prevalence correction, the pooled conformal branch lowers the review rate by releasing more patients, some of whom are event-positive. Within the audit, the classwise branch acts as a scarcity diagnostic: the pilot has too few event labels to certify safe low-review release.
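As an illustration of the two mechanical steps described above (disjoint role assignment and direct evaluation of release), the following is a minimal Python sketch, not the authors' implementation. The function names `split_roles` and `release_audit`, the split fractions, the action labels `"release"`, `"flag"`, and `"defer"`, and the choice to count only deferred cases as human review are all assumptions made for this example.

```python
import random


def split_roles(subject_ids, fractions=(0.3, 0.4, 0.3), seed=0):
    """Assign each subject to exactly one of three disjoint roles.

    The fractions are illustrative assumptions, not values from the paper.
    """
    ids = list(subject_ids)
    random.Random(seed).shuffle(ids)
    n = len(ids)
    a = int(fractions[0] * n)
    b = a + int(fractions[1] * n)
    return {
        "prevalence": ids[:a],     # used only for prevalence correction
        "calibration": ids[a:b],   # used only for conformal calibration
        "evaluation": ids[b:],     # held out for release-safety evaluation
    }


def release_audit(actions, events):
    """Summarise release safety on the held-out evaluation subjects.

    actions: list of "release", "flag", or "defer" (assumed labels)
    events:  list of 0/1 event indicators, aligned with actions
    """
    n = len(actions)
    n_pos = sum(events)
    released_pos = sum(
        1 for a, e in zip(actions, events) if a == "release" and e == 1
    )
    n_defer = sum(1 for a in actions if a == "defer")
    return {
        "n": n,
        "n_event_positive": n_pos,
        "released_event_positive": released_pos,
        # Share of event-positive subjects cleared without review.
        "event_positive_release_rate": released_pos / n_pos if n_pos else None,
        # Assumption: only "defer" counts toward the human-review rate.
        "review_rate": n_defer / n if n else None,
    }
```

Reporting `n_event_positive` next to the release rate keeps label scarcity visible: when the evaluation role contains only a handful of event-positive subjects, a low release rate among them is weak evidence of safe release.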