Linear mixed models (LMMs) are suitable for clustered data and are common in biometrics, medicine, survey statistics and many other fields. In those applications, it is essential to carry out valid inference after selecting a subset of the available variables. We construct confidence sets for the fixed effects in Gaussian LMMs that are based on Lasso-type estimators. Aside from providing confidence regions, this approach also quantifies the joint uncertainty of variable selection and parameter estimation in the procedure. To show that the resulting confidence sets for the fixed effects are uniformly valid over the parameter spaces of both the regression coefficients and the covariance parameters, we also prove a novel result on the uniform Cramér consistency of the restricted maximum likelihood (REML) estimators of the covariance parameters. The superiority of the constructed confidence sets over naive post-selection procedures is validated in simulations and illustrated with a study of the acid neutralization capacity of lakes in the United States.
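For orientation, the setting summarized above can be written out in generic notation. This is a minimal sketch under stated assumptions, not the authors' own formulation: the symbols $m$, $X_i$, $Z_i$, $b_i$, $G(\theta)$, $R_i(\theta)$, $V_i(\theta)$ and the tuning parameter $\lambda$ are assumed here for illustration, and the penalized criterion shown is one common Lasso-type form that may differ from the estimator actually used.

```latex
% Gaussian linear mixed model for clusters i = 1, ..., m (assumed notation)
\[
  y_i = X_i \beta + Z_i b_i + \varepsilon_i, \qquad
  b_i \sim \mathcal{N}\bigl(0, G(\theta)\bigr), \qquad
  \varepsilon_i \sim \mathcal{N}\bigl(0, R_i(\theta)\bigr),
\]
% with b_i and \varepsilon_i independent, so that marginally
\[
  y_i \sim \mathcal{N}\bigl(X_i \beta,\, V_i(\theta)\bigr), \qquad
  V_i(\theta) = Z_i G(\theta) Z_i^{\top} + R_i(\theta).
\]
% One common Lasso-type estimator of the fixed effects \beta \in \mathbb{R}^p,
% with the covariance parameters \theta replaced by an estimate \hat{\theta}
% (for example, the REML estimate):
\[
  \hat{\beta}_{L} \in \arg\min_{\beta \in \mathbb{R}^p}
  \sum_{i=1}^{m} (y_i - X_i \beta)^{\top} V_i(\hat{\theta})^{-1} (y_i - X_i \beta)
  + 2 \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert .
\]
```

In this notation, $\beta$ collects the fixed effects for which confidence sets are sought, and $\theta$ collects the covariance parameters estimated by REML. Uniform validity then refers to coverage holding uniformly over both $\beta$ and $\theta$.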