Longitudinal data analysis is fundamental to understanding dynamic processes in the biomedical and social sciences. Varying coefficient models (VCMs) provide a flexible framework by allowing covariate effects to evolve over time, but fitting every effect as time-varying can lead to overfitting, efficiency loss, and reduced interpretability when some effects are actually constant. Conversely, standard linear mixed models (LMMs) may suffer substantial bias when temporal heterogeneity is ignored. To address this issue, we propose TV-Select (time-varying effect selection), a unified framework for structural identification that simultaneously selects relevant variables and determines whether each selected effect is constant or time-varying. The proposed method decomposes each coefficient function into a time-invariant mean component and a centered time-varying deviation, where the latter is approximated by B-splines. We then construct a doubly penalized objective function that combines a group Lasso penalty for structural sparsity with a roughness penalty for smoothness control, and we develop an efficient block coordinate descent algorithm for computation. Under standard semiparametric regularity conditions, we establish selection consistency and oracle-type asymptotic properties, including asymptotic normality for the constant-effect component after correct structure recovery. Simulation studies and a real-data application show that TV-Select achieves more accurate structural recovery, smoother functional estimation, and better predictive performance than competing methods.
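To make two of the ingredients named in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes NumPy, and the function names `split_constant_and_deviation` and `group_soft_threshold` are hypothetical. The first function splits sampled values of a coefficient function into a mean and a centered deviation, using the average over the sampled time points as a stand-in for the time-invariant component (the centering used in the paper may differ). The second is the standard proximal operator of the group Lasso penalty, the kind of block update that block coordinate descent schemes typically rely on: when a block of spline coefficients has a small norm, the whole block is set to zero, which corresponds to declaring that effect constant.

```python
import numpy as np


def split_constant_and_deviation(beta_values):
    """Split sampled coefficient values into a mean and a centered deviation.

    Assumption: the mean over the sampled time points stands in for the
    time-invariant component; the deviation then averages to zero.
    """
    beta_values = np.asarray(beta_values, dtype=float)
    alpha = float(np.mean(beta_values))
    return alpha, beta_values - alpha


def group_soft_threshold(theta, lam):
    """Proximal operator of lam * ||theta||_2 (group Lasso penalty).

    Returns the zero vector when ||theta||_2 <= lam, otherwise shrinks
    the whole block toward zero by a common factor.
    """
    theta = np.asarray(theta, dtype=float)
    norm = np.linalg.norm(theta)
    if norm <= lam:
        return np.zeros_like(theta)
    return (1.0 - lam / norm) * theta


if __name__ == "__main__":
    # A coefficient sampled at three time points: mean part and deviation.
    alpha, deviation = split_constant_and_deviation([1.0, 2.0, 3.0])
    print(alpha, deviation)

    # A weak block of (hypothetical) spline coefficients is zeroed out,
    # while a strong block is only shrunk.
    print(group_soft_threshold([0.3, 0.4], lam=1.0))
    print(group_soft_threshold([3.0, 4.0], lam=1.0))
```

In this sketch, a zeroed deviation block leaves only the mean component, so the effect is treated as constant. How the threshold is tuned, and how the roughness penalty enters the block update, are not specified in the abstract and are left out here.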