An inner-product Hilbert space formulation of the Kemeny distance is defined over the domain of all permutations with ties on the extended real line, and yields an unbiased, minimum-variance (Gauss-Markov) correlation estimator for a homogeneous i.i.d. sample. In this work, we state and prove the requirements necessary to extend this linear topology to both Spearman's \(\rho\) and Kendall's \(\tau_{b}\), and show that both resulting spaces are biased and inefficient on practical data domains. A probability distribution is defined for the Kemeny \(\tau_{\kappa}\) estimator, together with a Studentisation adjustment for finite samples. This work allows a general-purpose linear model duality to be identified as a unique consistent solution to many biased and unbiased estimation scenarios.
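For readers unfamiliar with the underlying metric, the following is a minimal sketch of the classic Kemeny-Snell distance between two score vectors that may contain ties. It is an illustration only: the abstract does not give the paper's inner-product formulation or the definition of \(\tau_{\kappa}\), so the pairwise sign encoding used here (ties scored as 0) and the function names `sign` and `kemeny_snell_distance` are assumptions taken from the textbook definition, not from this work.

```python
from itertools import combinations


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def kemeny_snell_distance(x, y):
    """Classic Kemeny-Snell distance between two score vectors.

    Each vector is encoded as a skew-symmetric matrix a[i][j] = sign(x[i] - x[j]),
    so a tied pair is scored 0 (assumed textbook encoding, not the paper's).
    The distance is 0.5 * sum over ordered pairs of |a_ij - b_ij|, which equals
    the sum over unordered pairs because both matrices are skew-symmetric.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    total = 0
    for i, j in combinations(range(len(x)), 2):
        total += abs(sign(x[i] - x[j]) - sign(y[i] - y[j]))
    return total


if __name__ == "__main__":
    # A full reversal of three untied items disagrees on every pair.
    print(kemeny_snell_distance([1, 2, 3], [3, 2, 1]))
    # Introducing one tie moves the ranking by a single unit.
    print(kemeny_snell_distance([1, 1, 2], [1, 2, 3]))
```

Under this encoding, a pair that is strictly ordered in one vector and tied in the other contributes 1, while a pair ordered in opposite directions contributes 2, so the maximum distance over \(n\) items is \(n(n-1)\).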