Linear mixed models (LMMs) are suitable for clustered data and are common in biometrics, medicine, survey statistics and many other fields. In these applications it is essential to carry out valid inference after selecting a subset of the available variables. We construct confidence sets for the fixed effects in Gaussian LMMs that are based on Lasso-type estimators. Aside from providing confidence regions, this also allows one to quantify the joint uncertainty of both variable selection and parameter estimation in the procedure. To show that the resulting confidence sets for the fixed effects are uniformly valid over the parameter spaces of both the regression coefficients and the covariance parameters, we also prove a novel result on the uniform Cramér consistency of the restricted maximum likelihood (REML) estimators of the covariance parameters. The superiority of the constructed confidence sets over naive post-selection procedures is validated in simulations and illustrated with a study of the acid neutralization capacity of lakes in the United States.