In the analysis of binary longitudinal data, it is of interest to model the dynamic relationship between a response and covariates as a function of time, while also investigating similar patterns of time-dependent interactions. We present a novel generalized varying-coefficient model that accounts for within-subject variability and simultaneously clusters varying-coefficient functions, without restricting the number of clusters or overfitting the data. In the analysis of a heterogeneous series of binary data, the model extracts population-level fixed effects, cluster-level varying effects, and subject-level random effects. Simulation studies demonstrate the validity and utility of the proposed method in correctly specifying cluster-specific varying coefficients when the number of clusters is unknown. The proposed method is applied to a heterogeneous series of binary data from the German Socioeconomic Panel (GSOEP) study, where we identify three major clusters that exhibit different varying effects of socioeconomic predictors on working status as a function of age.