Quantifying heterogeneity is an important issue in meta-analysis, and among the existing measures, the $I^2$ statistic is the most commonly used in the literature. In this paper, we show that the definition of the $I^2$ statistic has been problematic, or even completely wrong, from the very beginning. To support this claim, we first present a motivating example showing that the $I^2$ statistic depends heavily on the study sample sizes and may consequently yield contradictory conclusions about the amount of heterogeneity. Moreover, by drawing a connection between ANOVA and meta-analysis, we show that the $I^2$ statistic mistakenly uses the sampling errors of the estimators rather than the variances of the study populations. Inspired by this, we introduce an Intrinsic measure for Quantifying the heterogeneity in meta-analysis and study its statistical properties to clarify why it is superior to the existing measures. We further propose an optimal estimator of the new heterogeneity measure, referred to as the IQ statistic, that can be readily applied in meta-analysis. Simulations and real data analysis demonstrate that the IQ statistic provides a nearly unbiased estimate of the true heterogeneity and is independent of the study sample sizes.
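The dependence of $I^2$ on the study sample sizes can be illustrated with a minimal sketch. It assumes the usual definition $I^2 = \max\{0, (Q - (k-1))/Q\}$ with Cochran's $Q$ and inverse-variance weights; the three study means, the common standard deviation, and the sample sizes are hypothetical values chosen for illustration, not the motivating example or data from the paper, and the IQ statistic itself is not implemented here.

```python
def i_squared(means, sds, ns):
    """I^2 computed from per-study means, standard deviations and sample sizes.

    Assumes the usual form I^2 = max(0, (Q - (k - 1)) / Q), where Q is
    Cochran's Q with inverse-variance weights.
    """
    # Sampling variance of each study mean: s_i^2 / n_i
    variances = [sd ** 2 / n for sd, n in zip(sds, ns)]
    weights = [1.0 / v for v in variances]

    # Inverse-variance weighted pooled mean
    pooled = sum(w * y for w, y in zip(weights, means)) / sum(weights)

    # Cochran's Q: weighted squared deviations from the pooled mean
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, means))
    df = len(means) - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q)


# Hypothetical inputs: the study means and SDs are held fixed,
# and only the per-study sample size n changes.
means = [0.0, 0.2, 0.4]
sds = [1.0, 1.0, 1.0]

for n in (25, 100, 400):
    print(n, round(i_squared(means, sds, [n, n, n]), 3))
```

With these hypothetical inputs, $I^2$ rises from about 0 at $n = 25$ to about 0.75 at $n = 100$ and about 0.94 at $n = 400$, even though the study means and standard deviations are unchanged.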