Recently, there has been great interest in estimating the conditional average treatment effect using flexible machine learning methods. In practice, however, investigators often have working hypotheses about effect heterogeneity across pre-defined subgroups of study units; we call estimating effects within such subgroups the groupwise approach. This paper compares two modern ways to estimate groupwise treatment effects, a nonparametric approach and a semiparametric approach, with the goal of better informing practice. Specifically, we (a) compare their underlying assumptions, (b) compare their efficiency and adaptation to the underlying data-generating model, and (c) describe a way to combine the two approaches. We also discuss how to test a key assumption underlying the semiparametric estimator and how to obtain cluster-robust standard errors when study units in the same subgroup are correlated. We illustrate our findings with simulation studies and a reanalysis of the Early Childhood Longitudinal Study.