Minimax rates for heterogeneous causal effect estimation

Estimation of heterogeneous causal effects (i.e., how the effects of policies and treatments vary across subjects) is a fundamental task in causal inference, playing a crucial role in optimal treatment allocation, generalizability, subgroup effects, and more. Many flexible methods for estimating conditional average treatment effects (CATEs) have been proposed in recent years, but questions surrounding optimality have remained largely unanswered. In particular, a minimax theory of optimality has yet to be developed, with the minimax rate of convergence and the construction of rate-optimal estimators remaining open problems. In this paper we derive the minimax rate for CATE estimation in a nonparametric model where distributional components are Hölder-smooth, and we present a new local polynomial estimator, giving high-level conditions under which it is minimax optimal. More specifically, our minimax lower bound is derived via a localized version of the method of fuzzy hypotheses, combining lower bound constructions for nonparametric regression and functional estimation. Our proposed estimator can be viewed as a local polynomial R-Learner, based on a localized modification of higher-order influence function methods; it is shown to be minimax optimal under a condition on how accurately the covariate distribution is estimated. The minimax rate we find exhibits several interesting features, including a non-standard elbow phenomenon and an unusual interpolation between nonparametric regression and functional estimation rates. The latter quantifies the sense in which the CATE, as an estimand, can be viewed as a regression/functional hybrid. We conclude with a discussion of some remaining open problems.
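
For orientation, the estimand and the generic R-Learner idea referred to above can be written out as follows. This is only a sketch in assumed notation that the abstract does not fix: Y is the outcome, A a binary treatment, X the covariates, \eta(x) = E[Y | X = x], \pi(x) = P(A = 1 | X = x), K_h a kernel with bandwidth h, and b(\cdot) a polynomial basis. It illustrates a plain local polynomial R-Learner under standard identification assumptions, not the higher-order estimator proposed in the paper.

```latex
% CATE under consistency, no unmeasured confounding, and positivity (assumed)
\tau(x) = E[Y^{1} - Y^{0} \mid X = x] = E[Y \mid X = x, A = 1] - E[Y \mid X = x, A = 0]

% Robinson-type decomposition underlying the R-Learner
Y - \eta(X) = \{A - \pi(X)\}\,\tau(X) + \epsilon, \qquad E[\epsilon \mid X, A] = 0

% Local polynomial R-Learner at a point x_0, with nuisance estimates \hat\eta, \hat\pi
\hat\beta = \arg\min_{\beta} \sum_{i=1}^{n} K_h(X_i - x_0)
  \Big[ \{Y_i - \hat\eta(X_i)\} - \{A_i - \hat\pi(X_i)\}\, b\Big(\tfrac{X_i - x_0}{h}\Big)^{\top} \beta \Big]^2,
\qquad \hat\tau(x_0) = b(0)^{\top} \hat\beta
```

The second line shows why the CATE behaves like a regression function of residualized quantities, while its dependence on the nuisance functions \eta and \pi is what brings in the functional estimation side of the problem.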
