This paper considers the problem of inference in cluster randomized experiments when cluster sizes are non-ignorable. Here, by a cluster randomized experiment, we mean one in which treatment is assigned at the level of the cluster; by non-ignorable cluster sizes we mean that the distribution of potential outcomes, and the treatment effects in particular, may depend non-trivially on the cluster sizes. In order to permit this sort of flexibility, we consider a sampling framework in which cluster sizes themselves are random. In this way, our analysis departs from earlier analyses of cluster randomized experiments in which cluster sizes are treated as non-random. We distinguish between two different parameters of interest: the equally-weighted cluster-level average treatment effect, and the size-weighted cluster-level average treatment effect. For each parameter, we provide methods for inference in an asymptotic framework where the number of clusters tends to infinity and treatment is assigned using a covariate-adaptive stratified randomization procedure. We additionally permit the experimenter to sample only a subset of the units within each cluster rather than the entire cluster and demonstrate the implications of such sampling for some commonly used estimators. A small simulation study and empirical demonstration show the practical relevance of our theoretical results.
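The distinction between the two parameters can be made concrete with a toy calculation. The sketch below is illustrative only and is not the paper's estimator. It assumes a hypothetical finite set of clusters in which both potential outcomes are known for every unit, which never holds in a real experiment. The names `Cluster`, `equally_weighted_ate`, and `size_weighted_ate` are made up for this example. The equally-weighted version averages the within-cluster mean effects, so every cluster counts the same. The size-weighted version divides the total effect by the total number of units, so larger clusters count more.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Cluster:
    """Hypothetical cluster with both potential outcomes known per unit."""
    y1: List[float]  # potential outcomes under treatment, one per unit
    y0: List[float]  # potential outcomes under control, one per unit

    @property
    def size(self) -> int:
        return len(self.y1)

    def effect_sum(self) -> float:
        # Sum of unit-level treatment effects within the cluster.
        return sum(a - b for a, b in zip(self.y1, self.y0))


def equally_weighted_ate(clusters: List[Cluster]) -> float:
    # Average of within-cluster mean effects: each cluster has weight 1/G.
    return sum(c.effect_sum() / c.size for c in clusters) / len(clusters)


def size_weighted_ate(clusters: List[Cluster]) -> float:
    # Total effect over total units: each cluster is weighted by its size.
    return sum(c.effect_sum() for c in clusters) / sum(c.size for c in clusters)


# Toy data (assumed): the effect is 1 in a 2-unit cluster and 3 in an
# 8-unit cluster, so the effect depends on cluster size (non-ignorable).
clusters = [
    Cluster(y1=[1.0, 1.0], y0=[0.0, 0.0]),
    Cluster(y1=[3.0] * 8, y0=[0.0] * 8),
]

ew = equally_weighted_ate(clusters)  # (1 + 3) / 2
sw = size_weighted_ate(clusters)     # (2*1 + 8*3) / 10
print(f"equally weighted: {ew:.2f}, size weighted: {sw:.2f}")
```

In this toy example the two quantities differ (2.0 versus 2.6) only because the effect varies with cluster size. If the effect were the same in every cluster, the two would coincide.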