Bergsma (2006) proposed a covariance $\kappa(X,Y)$ between random variables $X$ and $Y$, and derived the asymptotic distributions of its estimators under the null hypothesis of independence between $X$ and $Y$. The non-null (dependent) case does not seem to have been studied in the literature. We derive several alternative expressions for $\kappa$. One of them leads to an intuitive estimator of $\kappa(X,Y)$ that is a simple function of four naturally arising U-statistics. We derive the exact finite-sample relation among all three estimators. The asymptotic distribution of our estimator, and hence also of the other two estimators, in the non-null (dependent) case is then obtained from the central limit theorem for U-statistics. For specific parametric bivariate distributions, the value of $\kappa$ can be expressed in terms of the natural dependence parameters of these distributions. In particular, we derive the formula for $\kappa$ when $(X,Y)$ follows Gumbel's bivariate exponential distribution. We bring out various aspects of these estimators through extensive simulations from several prominent bivariate distributions. Specifically, we investigate the empirical relationship between $\kappa$ and the dependence parameters, the distributional properties of the estimators, and their accuracy. We also investigate the power of these estimators for testing independence, comparing them with one another and with other well-known measures of dependence. Based on these simulations, the proposed estimator appears to be as good as or better than its competitors in terms of both power and computational efficiency.
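As a rough illustration of how an estimator built from four U-statistics might look, the sketch below is not taken from the paper. It assumes that $\kappa$ admits the distance-covariance-type representation $\kappa = \tfrac{1}{4}\left(E|X_1-X_2||Y_1-Y_2| + E|X_1-X_2|\,E|Y_1-Y_2| - 2E|X_1-X_2||Y_1-Y_3|\right)$ and replaces each expectation by the corresponding U-statistic. The function name `kappa_plugin` is hypothetical, and the estimator proposed in the paper may differ in form.

```python
import numpy as np


def kappa_plugin(x, y):
    """Plug-in estimate of kappa from four U-statistics (illustrative sketch).

    Assumes the representation
        kappa = (E|X1-X2||Y1-Y2| + E|X1-X2| E|Y1-Y2| - 2 E|X1-X2||Y1-Y3|) / 4,
    which is an assumption of this example, not a formula quoted from the paper.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    if y.size != n:
        raise ValueError("x and y must have the same length")
    if n < 3:
        raise ValueError("at least three observations are required")

    # Pairwise absolute differences; both matrices have a zero diagonal.
    a = np.abs(x[:, None] - x[None, :])
    b = np.abs(y[:, None] - y[None, :])

    pairs = n * (n - 1)  # number of ordered pairs (i, j) with i != j

    # U-statistics over pairs of observations.
    u_ab = (a * b).sum() / pairs  # estimates E|X1-X2||Y1-Y2|
    u_a = a.sum() / pairs         # estimates E|X1-X2|
    u_b = b.sum() / pairs         # estimates E|Y1-Y2|

    # U-statistic over triples (i, j, k) of distinct indices: a_ij * b_ik.
    # The product of row sums includes the terms with j == k, so subtract them.
    triple_sum = (a.sum(axis=1) * b.sum(axis=1)).sum() - (a * b).sum()
    u_abc = triple_sum / (pairs * (n - 2))  # estimates E|X1-X2||Y1-Y3|

    return 0.25 * (u_ab + u_a * u_b - 2.0 * u_abc)
```

This version builds the full $n \times n$ distance matrices, so it uses $O(n^2)$ time and memory. It is meant to make the structure of such an estimator concrete, not to reflect the computational efficiency discussed in the abstract.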