Longitudinal data often involve heterogeneity, sparse signals, and contamination from response outliers or high-leverage observations, especially in the biomedical sciences. Existing methods usually address only part of this problem: they emphasize either penalized mixed-effects modeling without robustness or robust mixed-effects estimation without high-dimensional variable selection. We propose a doubly adaptive robust regression (DAR-R) framework for longitudinal linear mixed-effects models. It combines a robust pilot fit, doubly adaptive observation weights for residual outliers and leverage points, and folded concave penalization for fixed-effect selection, together with weighted updates of the random effects and variance components. We develop an iterative reweighting algorithm and establish estimation and prediction error bounds, support recovery consistency, and oracle-type asymptotic normality. Simulations show that DAR-R improves estimation accuracy, false-positive control, and covariance estimation under both vertical outliers and bad leverage contamination. In the TADPOLE/ADNI Alzheimer's disease application, DAR-R achieves accurate and stable prediction of ADAS13 while selecting clinically meaningful predictors with strong resampling stability.
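To make the weighting and penalization ingredients named in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It rests on several assumptions that the abstract does not state: Huber-type weights for residual outliers, a MAD-scaled distance from the coordinate-wise median for leverage, a simple product to combine the two weights, and SCAD as the folded concave penalty. All function names and tuning constants (c = 1.345, a = 3.7, the 90% distance cutoff) are hypothetical.

```python
import numpy as np

def residual_weights(resid, c=1.345):
    """Huber-type weights (assumed form): 1 for small standardized
    residuals, c/|r| for large ones. Scale is the normalized MAD."""
    scale = 1.4826 * np.median(np.abs(resid - np.median(resid)))
    r = np.abs(resid) / max(scale, 1e-12)
    return np.minimum(1.0, c / np.maximum(r, 1e-12))

def leverage_weights(X, q=0.90):
    """Leverage weights (assumed form): downweight rows whose squared,
    MAD-scaled distance from the coordinate-wise median exceeds the
    empirical q-quantile of those distances."""
    center = np.median(X, axis=0)
    mad = 1.4826 * np.median(np.abs(X - center), axis=0)
    d2 = np.sum(((X - center) / np.maximum(mad, 1e-12)) ** 2, axis=1)
    cut = np.quantile(d2, q)
    return np.minimum(1.0, cut / np.maximum(d2, 1e-12))

def doubly_adaptive_weights(resid, X):
    """Combine the two weights by a product (an assumption made here)."""
    return residual_weights(resid) * leverage_weights(X)

def scad_derivative(beta, lam, a=3.7):
    """Derivative of the SCAD penalty at |beta|: equal to lam on
    [0, lam], decaying linearly to 0 at a*lam, and 0 beyond that."""
    b = np.abs(beta)
    return np.where(b <= lam, lam, np.maximum(a * lam - b, 0.0) / (a - 1.0))
```

In an iterative reweighting scheme of the kind the abstract describes, weights like these would be recomputed from the current residuals at each pass, and the SCAD derivative would supply coefficient-specific penalty levels for a weighted penalized fit of the fixed effects. The random-effects and variance-component updates are omitted from this sketch.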