A widely used formulation of null hypotheses in the analysis of multivariate $d$-dimensional data is $\mathcal{H}_0: \boldsymbol{H}\boldsymbol{\theta}=\boldsymbol{y}$ with $\boldsymbol{H}\in\mathbb{R}^{m\times d}$, $\boldsymbol{\theta}\in\mathbb{R}^d$ and $\boldsymbol{y}\in\mathbb{R}^m$, where $m\leq d$. Here the unknown parameter vector $\boldsymbol{\theta}$ can be, for example, the expectation vector $\boldsymbol{\mu}$, a vector $\boldsymbol{\beta}$ of regression coefficients, or a quantile vector $\boldsymbol{q}$. The vector of nonparametric relative effects $\boldsymbol{p}$ and the vectorized upper triangular part of a covariance matrix, $\textbf{v}$, are further useful choices. However, even apart from multiplying the hypothesis by a scalar $\gamma\neq 0$, there are many ways to formulate the same null hypothesis with different hypothesis matrices $\boldsymbol{H}$ and corresponding vectors $\boldsymbol{y}$. It is well known that for $\boldsymbol{y}=\boldsymbol{0}$ there exists a unique projection matrix $\boldsymbol{P}$ with $\boldsymbol{H}\boldsymbol{\theta}=\boldsymbol{0}\Leftrightarrow \boldsymbol{P}\boldsymbol{\theta}=\boldsymbol{0}$; for $\boldsymbol{y}\neq \boldsymbol{0}$, however, such a projection matrix does not necessarily exist. Moreover, since such hypotheses are often investigated using a quadratic form as the test statistic, the corresponding projection matrices often contain zero rows, so they are not even computationally efficient. In this manuscript, we show that for the Wald-type statistic (WTS), one of the most frequently used quadratic forms, the choice of the concrete hypothesis matrix does not affect the test decision. In addition, simulations are conducted to investigate the possible influence of the hypothesis matrix on the computation time.
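As a rough numerical illustration of the invariance claim, the following sketch compares the WTS for several formulations of the same hypothesis. It assumes the common form $N(\boldsymbol{H}\widehat{\boldsymbol{\theta}}-\boldsymbol{y})^\top(\boldsymbol{H}\widehat{\boldsymbol{\Sigma}}\boldsymbol{H}^\top)^{+}(\boldsymbol{H}\widehat{\boldsymbol{\theta}}-\boldsymbol{y})$ with a Moore-Penrose inverse, which is not spelled out in the text above. The function name `wald_type_statistic`, the simulated data, the contrast matrix and the matrix `A` are hypothetical choices made only for this example, with $\boldsymbol{\theta}$ taken to be the mean vector.

```python
import numpy as np


def wald_type_statistic(H, y, theta_hat, sigma_hat, n):
    """Assumed WTS form: n * r^T (H Sigma_hat H^T)^+ r with r = H theta_hat - y."""
    r = H @ theta_hat - y
    # Moore-Penrose inverse; the explicit cutoff guards against tiny
    # singular values when H Sigma_hat H^T is rank deficient.
    middle = np.linalg.pinv(H @ sigma_hat @ H.T, rcond=1e-10)
    return float(n * r @ middle @ r)


# Hypothetical data: n observations of a d-dimensional vector.
rng = np.random.default_rng(0)
n, d = 50, 3
X = rng.normal(size=(n, d)) + np.array([0.0, 0.3, 0.6])
theta_hat = X.mean(axis=0)            # estimator of the mean vector
sigma_hat = np.cov(X, rowvar=False)   # empirical covariance matrix

# Hypothesis "all components of theta are equal", written with a
# full-row-rank contrast matrix H (m = 2) ...
H = np.array([[1.0, -1.0, 0.0],
              [0.0, 1.0, -1.0]])
# ... and with the projection matrix onto the row space of H
# (here the 3 x 3 centering matrix).
P = H.T @ np.linalg.solve(H @ H.T, H)

wts_H = wald_type_statistic(H, np.zeros(2), theta_hat, sigma_hat, n)
wts_P = wald_type_statistic(P, np.zeros(d), theta_hat, sigma_hat, n)

# Case y != 0: (H, y) and (A H, A y) describe the same hypothesis
# for any invertible matrix A.
A = np.array([[2.0, 1.0],
              [0.0, -3.0]])
y = np.array([0.5, -0.2])
wts_Hy = wald_type_statistic(H, y, theta_hat, sigma_hat, n)
wts_AHy = wald_type_statistic(A @ H, A @ y, theta_hat, sigma_hat, n)

print(wts_H, wts_P)
print(wts_Hy, wts_AHy)
```

Under these assumptions the two values in each printed pair should agree up to floating-point error, although the projection matrix requires products with a $3\times 3$ instead of a $2\times 3$ matrix.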