Prediction sets can wrap around any ML model to cover unknown test outcomes with a guaranteed probability. Yet it remains unclear how to use them optimally for downstream decision-making. Here, we propose a decision-theoretic framework that seeks to minimize the expected loss (risk) against a worst-case distribution consistent with the prediction set's coverage guarantee. We first characterize the minimax optimal policy for a fixed prediction set, showing that it balances the worst-case loss inside the set against a penalty for potential losses outside the set. Building on this, we derive the optimal prediction set construction, which minimizes the resulting robust risk subject to a coverage constraint. Finally, we introduce Risk-Optimal Conformal Prediction (ROCP), a practical algorithm that targets these risk-minimizing sets while maintaining finite-sample, distribution-free marginal coverage. Empirical evaluations on medical diagnosis and safety-critical decision-making tasks demonstrate that ROCP reduces critical mistakes compared to baselines, particularly when out-of-set errors are costly.
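To make the minimax idea for a fixed prediction set concrete, the following is a minimal sketch under stated assumptions, not the paper's ROCP algorithm. It assumes a finite label space, a known loss matrix, and an uncertainty set containing every label distribution that places at least 1 - alpha mass on the prediction set. Under that assumption, the adversary puts 1 - alpha mass on the worst in-set label and the remaining alpha on the worst label overall, so the robust risk of an action is a weighted sum of those two maxima. The names `worst_case_risk`, `minimax_action`, `LOSS`, and `IN_SET`, and the toy loss values, are hypothetical and chosen only for illustration.

```python
import numpy as np


def worst_case_risk(loss, in_set, alpha):
    """Robust risk of each action for a fixed prediction set.

    loss   : array of shape (n_actions, n_labels), loss[a, y] = cost of action a if label is y
    in_set : boolean array of shape (n_labels,), True for labels in the prediction set
    alpha  : miscoverage level; the set is assumed to cover with probability >= 1 - alpha

    Assumption (illustrative): the adversary may pick any label distribution
    that puts at least 1 - alpha mass on the set.
    """
    loss = np.asarray(loss, dtype=float)
    in_set = np.asarray(in_set, dtype=bool)
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if not in_set.any():
        raise ValueError("prediction set must contain at least one label")

    worst_inside = loss[:, in_set].max(axis=1)  # worst loss among covered labels
    worst_anywhere = loss.max(axis=1)           # worst loss over all labels
    # The free alpha mass goes wherever it hurts most, inside or outside the set.
    return (1.0 - alpha) * worst_inside + alpha * worst_anywhere


def minimax_action(loss, in_set, alpha):
    """Return (index, robust risk) of the action minimizing the worst-case risk."""
    risks = worst_case_risk(loss, in_set, alpha)
    best = int(np.argmin(risks))
    return best, float(risks[best])


# Toy example: labels are (healthy, mild, severe); actions are (no_treatment, monitor, treat).
LOSS = np.array([
    [0.0, 1.0, 10.0],  # no_treatment: cheap unless the case is severe
    [1.0, 1.5, 4.0],   # monitor: moderate cost everywhere
    [3.0, 1.0, 0.0],   # treat: costly for healthy patients
])
IN_SET = np.array([True, True, False])  # prediction set = {healthy, mild}

if __name__ == "__main__":
    for a in (0.0, 0.1):
        idx, risk = minimax_action(LOSS, IN_SET, a)
        print(f"alpha={a}: action index {idx}, robust risk {risk:.2f}")
```

In this toy setup, ignoring out-of-set outcomes (alpha = 0) selects `no_treatment`, whereas accounting for a 10% chance that the true label lies outside the set shifts the choice to `monitor`, because the high out-of-set loss of `no_treatment` is penalized.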