This paper studies the performance of the spectral method in estimating, and quantifying the uncertainty of, the unobserved preference scores of compared entities in a general and more realistic setup. Specifically, the comparison graph consists of hyper-edges of possibly heterogeneous sizes, and the number of comparisons can be as low as one for a given hyper-edge. Such a setting is pervasive in real applications, and it circumvents the need to specify the graph randomness and the restrictive homogeneous sampling assumption imposed in the commonly used Bradley-Terry-Luce (BTL) or Plackett-Luce (PL) models. Furthermore, in scenarios where the BTL or PL models are appropriate, we unravel the relationship between the spectral estimator and the Maximum Likelihood Estimator (MLE). We discover that a two-step spectral method, in which the optimal weighting is estimated from the equal-weighting vanilla spectral method and then applied in a second step, can achieve the same asymptotic efficiency as the MLE. Given the asymptotic distributions of the estimated preference scores, we also introduce a comprehensive framework to carry out both one-sample and two-sample ranking inferences, applicable to both fixed and random graph settings. Notably, this is the first time effective two-sample rank testing methods have been proposed. Finally, we substantiate our findings via comprehensive numerical simulations and apply the developed methodology to perform statistical inference on rankings of statistical journals and movies.
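To make the equal-weighting vanilla spectral step concrete, the following is a minimal sketch, not the paper's implementation. It assumes each observation is a hyper-edge (a tuple of entity indices) together with the single entity chosen from it, builds a Markov chain in which mass moves from each non-chosen entity to the chosen one with weight 1/|hyper-edge|, and reads the scores off the logarithm of the stationary distribution. The function name `spectral_scores`, the input layout, the `weight` hook, and the normalizing constant `d` are all illustrative assumptions.

```python
import numpy as np


def spectral_scores(n, comparisons, weight=len):
    """Sketch of a spectral score estimate from multiway comparisons.

    n           -- number of entities, indexed 0..n-1
    comparisons -- list of (items, winner): `items` is a tuple of entity
                   indices (one hyper-edge), `winner` is the chosen entity
    weight      -- hypothetical hook mapping a hyper-edge to its weight;
                   the default len(items) is the equal-weighting choice
    Assumes every entity is chosen at least once and the graph is connected.
    """
    P = np.zeros((n, n))
    for items, winner in comparisons:
        w = 1.0 / weight(items)
        for i in items:
            if i != winner:
                P[i, winner] += w  # mass flows from non-chosen to chosen

    # Scale so that every row sum is at most 1/2, then fill the diagonal
    # so that each row sums to one (a lazy, aperiodic chain).
    d = 2.0 * max(P.sum(axis=1).max(), 1e-12)
    P /= d
    P[np.diag_indices(n)] = 1.0 - P.sum(axis=1)

    # Stationary distribution: left eigenvector of P for eigenvalue 1.
    vals, vecs = np.linalg.eig(P.T)
    pi = np.abs(np.real(vecs[:, np.argmax(np.real(vals))]))
    pi /= pi.sum()

    theta = np.log(pi)
    return theta - theta.mean()  # scores are identified up to a shift


if __name__ == "__main__":
    # One three-way hyper-edge observed six times: entity 0 is chosen
    # three times, entity 1 twice, entity 2 once.
    data = [((0, 1, 2), 0)] * 3 + [((0, 1, 2), 1)] * 2 + [((0, 1, 2), 2)]
    print(spectral_scores(3, data))
```

In this toy input the empirical choice frequencies are in ratio 3:2:1, and the chain satisfies detailed balance with a stationary distribution in the same ratio, so the estimated scores differ by log(3/2) and log(2). A second step in the spirit of the two-step method would pass a different `weight` function built from these first-step estimates; the specific optimal weights are not reproduced here.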