This paper investigates the rank aggregation problem through the lens of multi-way comparison data derived from rater scores. Departing from traditional parametric frameworks, such as the Bradley-Terry and Plackett-Luce models, we propose a model-free method that accommodates highly heterogeneous preference distributions across raters and encompasses weak stochastic transitivity in pairwise comparisons as a special case. We establish the theoretical foundations of the proposed estimator by proving its consistency: the proportion of discordant pairs (the normalized Kendall tau distance) converges to zero in probability as the number of raters diverges. Furthermore, we derive upper and lower bounds for a performance metric based on the Kendall tau distance. In certain asymptotic regimes, these bounds coincide up to logarithmic factors, so the estimator is nearly minimax optimal. These results are obtained by analyzing the convergence behavior of a U-empirical process; the novel technical results developed for this analysis may be of independent theoretical interest. The practical utility of our method is validated through extensive simulations and applications to sports player rankings and survey preference aggregation.
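To make the consistency criterion concrete, the following minimal sketch computes the proportion of discordant pairs between two rankings, each given as a vector of item scores. It illustrates only the evaluation metric, not the proposed estimator. The function name `discordant_proportion` and the convention that tied pairs count as non-discordant are assumptions made for illustration and are not taken from the paper.

```python
from itertools import combinations


def discordant_proportion(true_scores, est_scores):
    """Fraction of item pairs that the two score vectors order oppositely.

    Hypothetical helper for illustration. Assumption: a pair tied in
    either vector is counted as not discordant.
    """
    n = len(true_scores)
    if n != len(est_scores):
        raise ValueError("score vectors must have the same length")
    if n < 2:
        raise ValueError("need at least two items to form a pair")

    discordant = 0
    total = 0
    for i, j in combinations(range(n), 2):
        true_diff = true_scores[i] - true_scores[j]
        est_diff = est_scores[i] - est_scores[j]
        # Opposite signs mean the two rankings disagree on this pair.
        if true_diff * est_diff < 0:
            discordant += 1
        total += 1
    return discordant / total
```

Under this convention the value is 0 when the two rankings agree on every pair and 1 when one ranking is the exact reverse of the other. Consistency in the sense stated above means that this proportion, computed between the estimated and the true ranking, tends to zero in probability as the number of raters grows.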