Informative cluster size (ICS) arises in clustered data when a latent relationship exists between the number of participants in a cluster and the outcome measures. Although this phenomenon has been reported sporadically in the statistical literature for nearly two decades, several statistical methodologies still need further study to avoid potentially misleading inferences under ICS. For inference about population quantities without covariates, inverse cluster size reweighting is often employed to adjust for ICS. Separately, pseudo-value regression has gained popularity in time-to-event data analysis as a way to study the effect of covariates on disease progression described by a multistate model. We seek to answer the question: "How should pseudo-value regression be applied to clustered time-to-event data when cluster size is informative?" ICS adjustment by reweighting can be performed at two steps: when estimating the marginal functions of the multistate model, and when fitting the estimating equations based on pseudo-value responses. Applying or omitting the adjustment at each step leads to four possible strategies. We present theoretical arguments and thorough simulation experiments to ascertain the correct strategy for adjusting for ICS. We further extend our methodology to handle informativeness induced by the intra-cluster group size. We demonstrate the methods in two real-world applications: (i) determining predictors of tooth survival in a periodontal study, and (ii) identifying indicators of ambulatory recovery in spinal cord injury patients who participated in locomotor-training rehabilitation.
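To make the two ingredients named above concrete, the following is a minimal sketch, not the authors' implementation. It assumes a toy setting without censoring: inverse cluster size reweighting is illustrated on a simple marginal mean, and pseudo-values are computed with the standard leave-one-out jackknife formula. The function names (`unweighted_mean`, `ics_weighted_mean`, `jackknife_pseudo_values`) and the toy data are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def unweighted_mean(records):
    """Average over all observations; larger clusters contribute more."""
    values = [y for _, y in records]
    return sum(values) / len(values)


def ics_weighted_mean(records):
    """Inverse cluster size weighting: each observation gets weight 1/n_i,
    so every cluster contributes equally regardless of its size."""
    clusters = defaultdict(list)
    for cluster_id, y in records:
        clusters[cluster_id].append(y)
    # (1/M) * sum_i (1/n_i) * sum_j y_ij, with M clusters of size n_i
    cluster_means = [sum(ys) / len(ys) for ys in clusters.values()]
    return sum(cluster_means) / len(cluster_means)


def jackknife_pseudo_values(data, estimator):
    """Leave-one-out pseudo-values: n * theta_hat - (n - 1) * theta_hat_(-i)."""
    n = len(data)
    theta_full = estimator(data)
    pseudo = []
    for i in range(n):
        leave_one_out = data[:i] + data[i + 1:]
        pseudo.append(n * theta_full - (n - 1) * estimator(leave_one_out))
    return pseudo


def mean(values):
    return sum(values) / len(values)


# Hypothetical toy data: (cluster id, outcome). The larger cluster "A" has
# systematically higher outcomes, so cluster size is informative.
records = [("A", 1.0), ("A", 1.0), ("A", 1.0), ("A", 1.0),
           ("B", 0.0), ("B", 0.0)]

print(unweighted_mean(records))    # dominated by the large cluster
print(ics_weighted_mean(records))  # each cluster counts once
print(jackknife_pseudo_values([1.0, 2.0, 6.0], mean))
```

In this toy example the two means differ because the larger cluster carries higher outcomes. For the sample mean, the pseudo-values reduce to the observations themselves (up to floating-point rounding); with censored multistate data the estimator would instead be a marginal function such as a state occupation probability, and whether to weight it, the estimating equations, or both is exactly the question the paper addresses.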