Testing the equality of mean vectors across $g$ groups plays an important role in many scientific fields. In regular settings, likelihood-based statistics under the normality assumption offer a general solution to this task. However, standard asymptotic results are unreliable when the dimension $p$ of the data is large relative to the sample size $n_i$ of each group. We propose an exact directional test for the equality of $g$ normal mean vectors with a common unknown covariance matrix in a high-dimensional setting, provided that $\sum_{i=1}^g n_i \ge p+g+1$. In the case of two groups ($g=2$), the directional test coincides with Hotelling's $T^2$ test. In the more general situation where the $g$ independent groups may have different unknown covariance matrices, exactness no longer holds, but simulation studies show that the directional test is more accurate than most commonly used likelihood-based solutions, at least in a moderate-dimensional setting in which $p=O(n_i^\tau)$, $\tau \in (0,1)$. The robustness of the directional approach and of its competitors to departures from multivariate normality is also investigated numerically. The proposal is applied to data on blood characteristics of male athletes and to microarray data on gene expression in patients with breast tumors.
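For the two-group case, where the abstract states that the directional test coincides with Hotelling's $T^2$ test, the classical statistic can be sketched as follows. This is only a minimal illustration of the standard two-sample Hotelling's $T^2$ test with a pooled covariance estimate, not the authors' directional procedure for general $g$. The function name `hotelling_t2_two_sample` is hypothetical, the use of NumPy and SciPy is an assumption, and the sample-size guard simply applies the condition $\sum_{i=1}^g n_i \ge p+g+1$ with $g=2$.

```python
import numpy as np
from scipy import stats


def hotelling_t2_two_sample(x, y):
    """Classical two-sample Hotelling's T^2 test (hypothetical helper).

    x, y: arrays of shape (n1, p) and (n2, p), rows are observations.
    Assumes multivariate normality and a common covariance matrix.
    Returns (T^2 statistic, F statistic, p-value).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n1, p = x.shape
    n2, p_y = y.shape
    if p != p_y:
        raise ValueError("both samples must have the same dimension p")
    if min(n1, n2) < 2:
        raise ValueError("each group needs at least two observations")
    n = n1 + n2
    # Condition from the text with g = 2: n1 + n2 >= p + g + 1 = p + 3.
    if n < p + 3:
        raise ValueError("need n1 + n2 >= p + 3")

    # Difference of the two sample mean vectors.
    diff = x.mean(axis=0) - y.mean(axis=0)

    # Pooled estimate of the common covariance matrix.
    s1 = np.atleast_2d(np.cov(x, rowvar=False))
    s2 = np.atleast_2d(np.cov(y, rowvar=False))
    s_pooled = ((n1 - 1) * s1 + (n2 - 1) * s2) / (n - 2)

    # T^2 = (n1 * n2 / n) * diff' S_pooled^{-1} diff.
    t2 = (n1 * n2 / n) * float(diff @ np.linalg.solve(s_pooled, diff))

    # Under the null hypothesis the rescaled T^2 follows F(p, n - p - 1).
    df1, df2 = p, n - p - 1
    f_stat = t2 * df2 / (df1 * (n - 2))
    p_value = float(stats.f.sf(f_stat, df1, df2))
    return t2, f_stat, p_value
```

A call such as `hotelling_t2_two_sample(x, y)` with two `(n_i, p)` arrays returns the statistic and an exact p-value under normality and equal covariances. Once $p$ approaches $n_1+n_2$, the pooled covariance becomes singular or nearly so, which is why a sample-size condition of this kind is needed.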