Evaluating heterogeneity of treatment effects (HTE) across subgroups is common in both randomized trials and observational studies. Although several statistical challenges of HTE analyses, including low statistical power and multiple comparisons, are widely acknowledged, issues arising for clustered data, including cluster randomized trials (CRTs), have received less attention. Notably, the potential for model misspecification is increased given the complex clustering structure (e.g., due to correlation among individuals within a subgroup and cluster), which could impact inference and type 1 error rates. To elucidate this issue, we conducted a simulation study to evaluate the performance of common analytic approaches for testing the presence of HTE for continuous, binary, and count outcomes: generalized linear mixed models (GLMM) and generalized estimating equations (GEE), each including interaction terms between treatment group and subgroup. We found that standard GLMM analyses that assume a common correlation of participants within clusters can lead to severely elevated type 1 error rates, of up to 47.2% compared to the 5% nominal level, if the within-cluster correlation varies across subgroups. A flexible GLMM, which allows subgroup-specific within-cluster correlations, achieved the nominal type 1 error rate, as did GEE (though rates were slightly elevated even with as many as 50 clusters). Applying the methods to a real-world CRT with healthcare utilization as a count outcome, we found a large impact of the model specification on inference: the standard GLMM yielded a statistically significant interaction by sex (P=0.01), whereas the interaction was not statistically significant under the flexible GLMM and GEE (P=0.64 and 0.93, respectively). We recommend that HTE analyses using GLMM account for within-subgroup correlation to avoid anti-conservative inference.
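To make the setting concrete, the following is a minimal sketch of the kind of data the abstract describes: a CRT with no treatment effect and no HTE, in which the within-cluster correlation differs between two subgroups. Everything here is an illustrative assumption rather than the authors' simulation design: the function names `simulate_crt` and `anova_icc`, the continuous outcome, independent subgroup-specific random intercepts, and all parameter values are hypothetical.

```python
import numpy as np


def simulate_crt(n_clusters=50, n_per_subgroup=20, sd_cluster=(0.2, 1.0),
                 sd_resid=1.0, seed=0):
    """Simulate a continuous outcome under the null (no treatment effect, no HTE).

    Assumption: each cluster has one random intercept per subgroup, drawn
    independently, with a subgroup-specific standard deviation. This makes
    the within-cluster correlation differ between the two subgroups.
    """
    rng = np.random.default_rng(seed)
    cluster = np.repeat(np.arange(n_clusters), 2 * n_per_subgroup)
    subgroup = np.tile(np.repeat([0, 1], n_per_subgroup), n_clusters)
    # Randomize whole clusters to arms 0/1 in (near-)equal numbers.
    arm_by_cluster = rng.permutation(np.arange(n_clusters) % 2)
    treatment = arm_by_cluster[cluster]
    # Random intercepts: column s has standard deviation sd_cluster[s].
    b = rng.normal(size=(n_clusters, 2)) * np.asarray(sd_cluster)
    # The outcome does not depend on treatment, so the null of no HTE holds.
    y = b[cluster, subgroup] + rng.normal(scale=sd_resid, size=cluster.size)
    return cluster, subgroup, treatment, y


def anova_icc(y, cluster):
    """One-way ANOVA estimator of the intracluster correlation.

    Assumes a balanced design (same number of observations per cluster).
    """
    groups = [y[cluster == c] for c in np.unique(cluster)]
    m = len(groups[0])
    means = np.array([g.mean() for g in groups])
    msb = m * means.var(ddof=1)                      # between-cluster mean square
    msw = np.mean([g.var(ddof=1) for g in groups])   # within-cluster mean square
    return (msb - msw) / (msb + (m - 1) * msw)


cluster, subgroup, treatment, y = simulate_crt()
for s in (0, 1):
    mask = subgroup == s
    icc = anova_icc(y[mask], cluster[mask])
    print(f"subgroup {s}: estimated ICC = {icc:.3f}")
```

With these hypothetical defaults, the true within-cluster correlations are 0.2²/(0.2² + 1) ≈ 0.04 in subgroup 0 and 1/(1 + 1) = 0.5 in subgroup 1. A model that forces a single common correlation on such data is misspecified in the way the abstract warns about.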