Statistical inference in parametric models (e.g., the Bradley--Terry model and its variants) for paired-comparison data has been explored in the high-dimensional regime, in which the number of items involved in paired comparisons diverges. However, parametric models are highly susceptible to model misspecification. To relax the assumption of known distributions and gain flexibility, we propose a semiparametric framework that models the merits of items and covariate effects (e.g., home-field advantage) by introducing latent random variables with unspecified distributions. Because the number of parameters grows with the number of items, semiparametric inference is highly nontrivial. To address this issue, we employ a kernel-based least squares approach to estimate all unknown parameters. When each pair of items has a fixed number of comparisons and the number of items tends to infinity, we prove the consistency of all resulting estimators and derive their asymptotic normal distributions. To the best of our knowledge, this is the first study to conduct a semiparametric analysis of paired comparisons with an increasing dimension. We conduct simulations to evaluate the finite-sample performance of the proposed method and illustrate its practical utility by analyzing an NBA dataset.
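For orientation, the parametric baseline mentioned above can be sketched in a few lines. This is a minimal illustration of the classical Bradley--Terry model with a home-field advantage term, not the semiparametric estimator proposed here. The function names (`home_win_prob`, `simulate_round_robin`), the logistic form of the home effect, and the round-robin design with a fixed number of games per ordered pair are assumptions made for illustration.

```python
import math
import random


def home_win_prob(merit_home, merit_away, home_advantage=0.0):
    """Bradley-Terry probability that the home item beats the away item.

    Assumes the standard logistic form: the log-odds of a home win equal
    the merit difference plus an additive home-field effect.
    """
    log_odds = home_advantage + merit_home - merit_away
    return 1.0 / (1.0 + math.exp(-log_odds))


def simulate_round_robin(merits, home_advantage, n_per_pair=1, seed=0):
    """Simulate outcomes in which every ordered pair (home, away) meets
    a fixed number of times, as in the fixed-comparisons regime.

    Returns a list of (home_index, away_index, home_won) tuples.
    """
    rng = random.Random(seed)
    results = []
    n_items = len(merits)
    for home in range(n_items):
        for away in range(n_items):
            if home == away:
                continue
            p = home_win_prob(merits[home], merits[away], home_advantage)
            for _ in range(n_per_pair):
                results.append((home, away, rng.random() < p))
    return results


if __name__ == "__main__":
    # Hypothetical merits for three items and a positive home effect.
    games = simulate_round_robin([0.0, 0.5, -0.5], home_advantage=0.2, n_per_pair=2)
    print(len(games), "simulated comparisons")
```

If the true outcome distribution departs from this logistic form, estimates based on it can be biased, which is the misspecification concern that motivates leaving the latent distributions unspecified.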