Quantifying uncertainty is critical for the safe deployment of ranking models in real-world applications. Recent work offers a rigorous solution using conformal prediction in a full ranking scenario, which aims to construct prediction sets for the absolute ranks of test items based on the relative ranks of calibration items. However, by relying on upper bounds of the non-conformity scores, this method is overly conservative and produces unnecessarily large prediction sets. To address this, we propose Distribution-informed Conformal Ranking (DCR), which produces efficient prediction sets by deriving the exact distribution of the non-conformity scores. In particular, we find that the absolute ranks of calibration items, conditional on their relative ranks, follow Negative Hypergeometric distributions. DCR thus uses this rank distribution to derive the non-conformity score distribution and to determine conformal thresholds. We provide theoretical guarantees that, under mild assumptions, DCR achieves improved efficiency over the baseline while ensuring valid coverage. Extensive experiments demonstrate the superiority of DCR, which reduces the average prediction set size by up to 36% while maintaining valid coverage.
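To make the key distributional claim concrete, the following is a minimal sketch, not the paper's implementation. It assumes n exchangeable calibration items and m test items: under exchangeability, the absolute rank R of the calibration item with relative rank k among the calibration items satisfies P(R = r) = C(r-1, k-1) * C(n+m-r, n-k) / C(n+m, n) for r = k, ..., k+m, a shifted Negative Hypergeometric law. The helper names conditional_rank_pmf and rank_prediction_interval, and the greedy interval construction, are illustrative assumptions rather than DCR's actual thresholding rule.

```python
from math import comb

def conditional_rank_pmf(r, k, n, m):
    """P(absolute rank = r | relative rank = k) for a calibration item,
    given n calibration items and m exchangeable test items.

    Shifted Negative Hypergeometric pmf:
        P(R = r) = C(r-1, k-1) * C(n+m-r, n-k) / C(n+m, n),
    supported on r = k, ..., k + m.
    """
    if r < k or r > k + m:
        return 0.0
    return comb(r - 1, k - 1) * comb(n + m - r, n - k) / comb(n + m, n)

def rank_prediction_interval(k, n, m, alpha=0.1):
    """Hypothetical helper: a central interval of absolute ranks carrying
    conditional probability at least 1 - alpha, given relative rank k.
    (Illustrative only; DCR's actual threshold selection may differ.)"""
    support = list(range(k, k + m + 1))
    pmf = [conditional_rank_pmf(r, k, n, m) for r in support]
    lo, hi = 0, len(support) - 1
    mass = sum(pmf)
    while lo < hi:
        # Greedily drop whichever tail has the smaller probability, as long
        # as the remaining mass stays at or above 1 - alpha.
        drop_left = pmf[lo] <= pmf[hi]
        tail = pmf[lo] if drop_left else pmf[hi]
        if mass - tail < 1 - alpha:
            break
        mass -= tail
        if drop_left:
            lo += 1
        else:
            hi -= 1
    return support[lo], support[hi]

if __name__ == "__main__":
    # Example: 100 calibration items, 20 test items, calibration item ranked 50th.
    print(rank_prediction_interval(k=50, n=100, m=20, alpha=0.1))
```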