Ranking with Confidence: A Probabilistic Framework for Deterministic Ranking Methods

Rankings are central to decision-making in fields ranging from education to online platforms, yet classical deterministic methods such as the Borda count or Copeland-type pairwise methods ignore the uncertainty that arises from sampling noise or incomplete data. We propose a probabilistic framework that treats true ranks as latent random variables, which makes it possible to quantify ranking uncertainty. We introduce new ranking criteria based on pairwise dominance probabilities, derive approximate inference procedures, and provide a novel Worst Best rank method for constructing simultaneous and individual confidence intervals for ranks. Our approach is the first to provide formal uncertainty quantification for classical deterministic rankings. It is also inherently robust to missing data: unlike Copeland-type methods, which penalize entities with fewer observed comparisons by assigning them fewer wins, our pairwise probability model adjusts for incompleteness and eliminates the bias toward items with more complete records. The resulting rankings reflect underlying performance rather than data availability, enhancing fairness, transparency, and statistical reliability in high-stakes applications.
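The missing-data point can be illustrated with a small sketch. This is not the paper's model or inference procedure: it only contrasts a Copeland-style win count with a simple average of empirical pairwise win rates over observed opponents. The toy data and the helper names `pair_counts`, `copeland_wins`, and `mean_dominance` are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical toy data: one (winner, loser) tuple per observed comparison.
# C beat both A and B, but C was never compared with D.
matches = [
    ("A", "B"), ("A", "B"), ("B", "A"),
    ("A", "D"), ("A", "D"),
    ("B", "D"), ("B", "D"),
    ("C", "A"), ("C", "B"),
]

def pair_counts(matches):
    """Map (i, j) -> [wins of i over j, games observed between i and j]."""
    counts = defaultdict(lambda: [0, 0])
    for winner, loser in matches:
        counts[(winner, loser)][0] += 1
        counts[(winner, loser)][1] += 1
        counts[(loser, winner)][1] += 1
    return counts

def copeland_wins(matches):
    """Copeland-style score: number of opponents beaten by a strict majority."""
    score = defaultdict(int)
    for (i, _), (wins, games) in pair_counts(matches).items():
        score[i] += int(2 * wins > games)
    return dict(score)

def mean_dominance(matches):
    """Average empirical win rate over observed opponents only (illustrative
    stand-in for a pairwise dominance probability, not the paper's estimator)."""
    rates = defaultdict(list)
    for (i, _), (wins, games) in pair_counts(matches).items():
        rates[i].append(wins / games)
    return {i: sum(r) / len(r) for i, r in rates.items()}

copeland = copeland_wins(matches)
dominance = mean_dominance(matches)
by_dominance = sorted(dominance, key=dominance.get, reverse=True)
```

In this toy data, C ties with A on the Copeland-style count even though C beat A and never lost, because C has one fewer observed opponent. Averaging over observed opponents places C first. A full treatment along the lines of the abstract would also attach uncertainty to these estimates, which this sketch does not attempt.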

