Neyman (1923/1990) introduced the randomization model, which comprises the potential-outcomes notation for defining causal effects and a framework for large-sample inference based on the design of the experiment. However, the existing theory for this framework is far from complete, especially when the number of treatment levels diverges and the treatment group sizes vary. We provide a unified discussion of statistical inference under the randomization model with general treatment group sizes. We formulate the estimator as a linear permutational statistic and use results based on Stein's method to derive Berry-Esseen bounds on linear and quadratic functions of the estimator. These new Berry-Esseen bounds serve as the basis for design-based causal inference with a possibly diverging number of treatment levels and a diverging number of causal parameters of interest. We also fill an important gap by proposing novel variance estimators for experiments with possibly many treatment levels and no replications. With these results, design-based causal inference in general settings becomes more convenient and rests on stronger theoretical guarantees.
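To make the setting concrete, the following is a minimal sketch of the classical design-based analysis of a completely randomized experiment with several treatment levels and unequal group sizes. It is not the paper's method: it uses the textbook Neyman point estimator for a contrast of arm means together with the usual conservative variance estimator, and it assumes at least two units per arm, so it does not cover the unreplicated case that the new variance estimators address. The function names, the group sizes, the contrast vector, and the simulated potential outcomes are all hypothetical.

```python
import random
from statistics import mean, variance


def complete_randomization(group_sizes, rng):
    """Randomly assign units to treatment levels with fixed group sizes."""
    labels = [q for q, n_q in enumerate(group_sizes) for _ in range(n_q)]
    rng.shuffle(labels)  # every assignment with these group sizes is equally likely
    return labels


def neyman_contrast(y_obs, labels, contrast):
    """Estimate sum_q c_q * (mean potential outcome under level q).

    Returns the point estimate and the classical conservative variance
    estimate sum_q c_q^2 * s_q^2 / n_q, which needs n_q >= 2 in every arm.
    """
    est, var_hat = 0.0, 0.0
    for q, c_q in enumerate(contrast):
        y_q = [y for y, z in zip(y_obs, labels) if z == q]
        if len(y_q) < 2:
            raise ValueError("this sketch needs at least two units per arm")
        est += c_q * mean(y_q)
        var_hat += c_q ** 2 * variance(y_q) / len(y_q)  # sample variance
    return est, var_hat


rng = random.Random(0)
group_sizes = [30, 50, 20]  # hypothetical unequal treatment group sizes
n = sum(group_sizes)

# Hypothetical fixed potential outcomes: level q shifts each unit's baseline by q.
baseline = [rng.gauss(0.0, 1.0) for _ in range(n)]
potential = [[b + q for q in range(len(group_sizes))] for b in baseline]

# Randomness enters only through the assignment, as in the randomization model.
labels = complete_randomization(group_sizes, rng)
y_obs = [potential[i][labels[i]] for i in range(n)]

# Contrast comparing level 2 with level 0; the true value here is 2.
est, var_hat = neyman_contrast(y_obs, labels, contrast=[-1.0, 0.0, 1.0])
print(f"estimate = {est:.3f}, standard error = {var_hat ** 0.5:.3f}")
```

The potential outcomes are held fixed and only the assignment is random, which is why the estimator can be viewed as a statistic of a random permutation of the units.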