A century ago, Neyman showed how to evaluate the efficacy of a treatment using a randomized experiment under a minimal set of assumptions. This classical repeated sampling framework serves as the basis of routine experimental analyses conducted by today's scientists across disciplines. In this paper, we demonstrate that Neyman's methodology can also be used to experimentally evaluate the efficacy of individualized treatment rules (ITRs), which are derived by modern causal machine learning algorithms. In particular, we show how to account for the additional uncertainty resulting from a training process based on cross-fitting. The primary advantage of Neyman's approach is that it can be applied to any ITR, regardless of the properties of the machine learning algorithms used to derive it. We also show, somewhat surprisingly, that for certain metrics it is more efficient to conduct this ex-post experimental evaluation of an ITR than to conduct an ex-ante experimental evaluation that randomly assigns some units to the ITR. Our analysis demonstrates that Neyman's repeated sampling framework remains as relevant for causal inference today as it has been since its inception.
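For orientation, the classical analysis that the abstract builds on can be sketched in a few lines. The block below is a minimal illustration of the difference-in-means estimator for a completely randomized experiment, together with Neyman's conservative variance estimator (the sum of the within-arm sample variances, each divided by its arm size). It is not the ITR evaluation estimator developed in the paper, and it does not include the cross-fitting adjustment. The function name `neyman_ate` and the outcome values are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import mean, variance


def neyman_ate(y_treated, y_control):
    """Difference-in-means estimate with Neyman's conservative standard error.

    Hypothetical helper for illustration; assumes a completely randomized
    experiment with observed outcomes split by treatment arm.
    """
    n1, n0 = len(y_treated), len(y_control)
    if n1 < 2 or n0 < 2:
        raise ValueError("need at least two units in each arm")
    # Point estimate: mean outcome under treatment minus mean under control.
    estimate = mean(y_treated) - mean(y_control)
    # Conservative variance: within-arm sample variances (n - 1 denominator),
    # each divided by the number of units in that arm.
    var_hat = variance(y_treated) / n1 + variance(y_control) / n0
    return estimate, sqrt(var_hat)


# Made-up outcomes, four units per arm.
est, se = neyman_ate([5.0, 7.0, 6.0, 8.0], [4.0, 5.0, 3.0, 4.0])
print(f"estimate = {est:.3f}, standard error = {se:.3f}")
```

The variance estimator is conservative in the sense that, under the repeated sampling framework, its expectation is at least as large as the true sampling variance of the difference in means, so no modeling assumptions about the outcomes are required.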