(To appear in The American Statistician.) Distance covariance (Székely, Rizzo, and Bakirov, 2007) is a fascinating recent notion, which is popular as a test for dependence of any type between random variables $X$ and $Y$. This approach deserves to be touched upon in modern courses on mathematical statistics. It makes use of distances of the type $|X-X'|$ and $|Y-Y'|$, where $(X',Y')$ is an independent copy of $(X,Y)$. This raises natural questions about independence of variables like $X-X'$ and $Y-Y'$, about the connection between Cov$(|X-X'|,|Y-Y'|)$ and the covariance between doubly centered distances, and about necessary and sufficient conditions for independence. We show some basic results and present a new and nontechnical counterexample to a common fallacy, which provides more insight. We also show some motivating examples involving bivariate distributions and contingency tables, which can be used as didactic material for introducing distance correlation.
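As a didactic aid, the covariance between doubly centered distances can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the paper: it assumes univariate samples and the usual sample (V-statistic) version of squared distance covariance, and all function names below are hypothetical.

```python
from math import sqrt


def pairwise_distances(v):
    """Matrix of absolute differences |v_i - v_j| for a univariate sample."""
    return [[abs(vi - vj) for vj in v] for vi in v]


def double_center(d):
    """Subtract row and column means from d and add back the grand mean."""
    n = len(d)
    row_means = [sum(row) / n for row in d]
    col_means = [sum(d[i][j] for i in range(n)) / n for j in range(n)]
    grand_mean = sum(row_means) / n
    return [
        [d[i][j] - row_means[i] - col_means[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def distance_covariance_sq(x, y):
    """Sample squared distance covariance: mean of products of the
    doubly centered distance matrices of x and y."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    a = double_center(pairwise_distances(x))
    b = double_center(pairwise_distances(y))
    return sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n**2


def distance_correlation(x, y):
    """Sample distance correlation; returns 0.0 if either sample is constant."""
    dcov2 = distance_covariance_sq(x, y)
    denom = sqrt(distance_covariance_sq(x, x) * distance_covariance_sq(y, y))
    return sqrt(max(dcov2, 0.0) / denom) if denom > 0 else 0.0


def pearson_correlation(x, y):
    """Ordinary sample correlation, for comparison."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)


# Y = X^2 on a symmetric grid: dependent, yet linearly uncorrelated.
x = [-2, -1, 0, 1, 2]
y = [v * v for v in x]
print("Pearson correlation: ", pearson_correlation(x, y))
print("distance correlation:", distance_correlation(x, y))
```

In this toy example the Pearson correlation vanishes while the distance correlation is strictly positive, which is the kind of dependence the abstract refers to.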