In cluster randomized experiments, units are often recruited after the random cluster assignment, and data are available only for the recruited sample. Post-randomization recruitment can lead to selection bias, inducing systematic differences between the overall and the recruited populations, and between the recruited intervention and control arms. In this setting, we define causal estimands for the overall and the recruited populations. We first show that if units select their cluster independently of the treatment assignment, cluster randomization implies individual randomization in the overall population. We then prove that, under the assumption of ignorable recruitment, the average treatment effect on the recruited population can be consistently estimated from the recruited sample using inverse probability weighting. In general, the average treatment effect on the overall population cannot be identified. Nonetheless, we show, via a principal stratification formulation, that weighting the recruited sample identifies treatment effects on two meaningful subpopulations of the overall population: units who would be recruited into the study regardless of the assignment, and units who would be recruited into the study under treatment but not under control. We develop a corresponding estimation strategy and a sensitivity analysis method for checking the ignorable recruitment assumption.
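To make the inverse probability weighting step concrete, the following is a minimal sketch on a toy recruited sample, not the estimator developed in the paper. It assumes a single discrete covariate `x`, estimates the propensity score within the recruited sample as the treated share in each covariate stratum, and uses a normalized (Hajek-type) weighted difference in means. The function names `stratum_propensity` and `hajek_ipw_ate` and the toy data are hypothetical, and the sketch ignores clustering when it comes to uncertainty quantification.

```python
import numpy as np


def stratum_propensity(x, z):
    """Estimate e(x) = P(Z = 1 | X = x, recruited) as the treated share per stratum.

    Assumption for this sketch: x is discrete and every stratum contains
    both treated and control units, so that 0 < e(x) < 1.
    """
    x = np.asarray(x)
    z = np.asarray(z, dtype=float)
    e = np.empty_like(z)
    for value in np.unique(x):
        mask = x == value
        e[mask] = z[mask].mean()
    return e


def hajek_ipw_ate(y, z, e):
    """Normalized inverse-probability-weighted difference in means."""
    y = np.asarray(y, dtype=float)
    z = np.asarray(z, dtype=float)
    e = np.asarray(e, dtype=float)
    w1 = z / e                  # weights for recruited treated units
    w0 = (1.0 - z) / (1.0 - e)  # weights for recruited control units
    return np.sum(w1 * y) / np.sum(w1) - np.sum(w0 * y) / np.sum(w0)


# Toy recruited sample (hypothetical): recruitment under treatment favors x = 1.
x = np.array([0, 0, 0, 0, 1, 1, 1, 1])
z = np.array([1, 0, 0, 0, 1, 1, 1, 0])
y = 1.0 * z + 2.0 * x  # constant unit-level effect of 1 by construction

naive = y[z == 1].mean() - y[z == 0].mean()
weighted = hajek_ipw_ate(y, z, stratum_propensity(x, z))
print(f"naive difference in means: {naive:.2f}")
print(f"IPW estimate:              {weighted:.2f}")
```

In this toy sample the unweighted difference in means is 2, because the recruited treated arm contains more units with `x = 1`, while the weighted estimate recovers the constructed effect of 1. In practice the propensity score would typically come from a fitted model on richer covariates, and inference would need to account for the clustered design.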