In the linear mixed model (LMM), simultaneously assessing and comparing the dispersion relevance of explanatory variables associated with fixed and random effects remains an important open practical problem. Based on the restricted maximum likelihood equations in the variance components form of the LMM, we prove a proper decomposition of the sum of squares of the dependent variable into unbiased estimators of interpretable estimands of explained variation. This result leads to a natural extension of the well-known adjusted coefficient of determination to the LMM. Further, we allocate the novel unbiased estimators of explained variation to the specific contributions of covariates associated with fixed and random effects within a single model fit. These parameter-wise explained variations are easily interpretable quantities that assess the dispersion relevance of covariates associated with both fixed and random effects on a common scale, thus allowing for a covariate ranking. For illustration, we contrast the variation explained by subjects and by time in the longitudinal sleep deprivation study. By comparing the dispersion relevance of population characteristics and spatial levels, we identify literacy as a major driver of income inequality in Burkina Faso. Finally, we develop a novel relevance plot to visualize the dispersion relevance of high-dimensional genomic markers in Arabidopsis thaliana.
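For orientation, the well-known adjusted coefficient of determination that the abstract extends can be computed for an ordinary linear model (fixed effects only) as in the minimal sketch below. This is the classical textbook quantity, not the LMM estimator proposed in the paper; the function name `adjusted_r2` and the synthetic data are assumptions made purely for illustration.

```python
import numpy as np

def adjusted_r2(y, X):
    """Classical adjusted R^2 for ordinary least squares.

    X holds the p covariates WITHOUT an intercept column; an intercept
    is added internally. This is the fixed-effects-only baseline, not
    the LMM extension described in the abstract.
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    design = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    rss = float(resid @ resid)                    # residual sum of squares
    tss = float(((y - y.mean()) ** 2).sum())      # total sum of squares
    # Ratio of unbiased variance estimators: residual vs. total.
    return 1.0 - (rss / (n - p - 1)) / (tss / (n - 1))

# Synthetic illustration (hypothetical data): the second covariate is irrelevant.
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 2))
y = 1.0 + 2.0 * X[:, 0] + rng.normal(size=n)
r2_adj = adjusted_r2(y, X)
```

In this simulated setup the signal variance is 4 and the noise variance is 1, so the population share of explained variation is 0.8, and `r2_adj` should land near that value. The degrees-of-freedom correction is what makes both variance estimators in the ratio unbiased, which is the property the abstract carries over to the mixed-model setting.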