This paper considers the problem of inference in cluster randomized experiments when cluster sizes are non-ignorable. Here, by a cluster randomized experiment, we mean one in which treatment is assigned at the cluster level. By non-ignorable cluster sizes, we refer to the possibility that the individual-level average treatment effects may depend non-trivially on the cluster sizes. We frame our analysis in a super-population framework in which cluster sizes are random. In this way, our analysis departs from earlier analyses of cluster randomized experiments in which cluster sizes are treated as non-random. We distinguish between two different parameters of interest: the equally-weighted cluster-level average treatment effect, and the size-weighted cluster-level average treatment effect. For each parameter, we provide methods for inference in an asymptotic framework where the number of clusters tends to infinity and treatment is assigned using a covariate-adaptive stratified randomization procedure. We additionally permit the experimenter to sample only a subset of the units within each cluster rather than the entire cluster and demonstrate the implications of such sampling for some commonly used estimators. A small simulation study and empirical demonstration show the practical relevance of our theoretical results.
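To make the distinction between the two parameters concrete, the following is a minimal sketch that computes both averages for a toy set of clusters. It assumes one natural reading of the two definitions: the equally-weighted effect averages the within-cluster mean effects with each cluster counted once, and the size-weighted effect weights each cluster by its number of units. The `Cluster` class, the function names, and the numbers are hypothetical illustrations, not taken from the paper, and the sketch treats the cluster-level mean effects as known rather than estimated.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    size: int           # number of units in the cluster (hypothetical value)
    mean_effect: float  # within-cluster average of Y(1) - Y(0), assumed known here


def equally_weighted_effect(clusters):
    """Average the cluster-level mean effects; every cluster counts once."""
    return sum(c.mean_effect for c in clusters) / len(clusters)


def size_weighted_effect(clusters):
    """Average the cluster-level mean effects, weighting each cluster by its size."""
    total_units = sum(c.size for c in clusters)
    return sum(c.size * c.mean_effect for c in clusters) / total_units


# Illustrative data: the large cluster has a larger effect, so cluster size
# is non-ignorable in the sense described above.
clusters = [Cluster(10, 1.0), Cluster(10, 1.0), Cluster(80, 3.0)]

ew = equally_weighted_effect(clusters)  # (1 + 1 + 3) / 3
sw = size_weighted_effect(clusters)     # (10*1 + 10*1 + 80*3) / 100

print(f"equally-weighted: {ew:.3f}")  # prints 1.667
print(f"size-weighted:    {sw:.3f}")  # prints 2.600

# When effects do not vary with cluster size, the two averages coincide.
ignorable = [Cluster(10, 2.0), Cluster(80, 2.0)]
same_ew = equally_weighted_effect(ignorable)
same_sw = size_weighted_effect(ignorable)
```

In this toy example the two parameters differ (about 1.667 versus 2.6) only because the effect varies with cluster size. With a constant effect across clusters they are equal, so the choice of parameter matters exactly when cluster sizes are non-ignorable.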