Minimax rates for heterogeneous causal effect estimation

From arXiv: This update resolves the minimax rates regardless of whether the propensity scores or the regression functions are smoother, and now does so in two models, depending on whether the control or marginal regressions are assumed smooth. Some typos and errors have also been fixed.

Estimation of heterogeneous causal effects (i.e., how effects of policies and treatments vary across subjects) is a fundamental task in causal inference, playing a crucial role in optimal treatment allocation, generalizability, subgroup effects, and more. Many flexible methods for estimating conditional average treatment effects (CATEs) have been proposed in recent years, but questions surrounding optimality have remained largely unanswered. In particular, a minimax theory of optimality has yet to be developed, with the minimax rate of convergence and construction of rate-optimal estimators remaining open problems. In this paper we derive the minimax rate for CATE estimation, in a nonparametric model where distributional components are Hölder-smooth, and present a new local polynomial estimator, giving high-level conditions under which it is minimax optimal. More specifically, our minimax lower bound is derived via a localized version of the method of fuzzy hypotheses, combining lower bound constructions for nonparametric regression and functional estimation. Our proposed estimator can be viewed as a local polynomial R-Learner, based on a localized modification of higher-order influence function methods; it is shown to be minimax optimal under a condition on how accurately the covariate distribution is estimated. The minimax rate we find exhibits several interesting features, including a non-standard elbow phenomenon and an unusual interpolation between nonparametric regression and functional estimation rates. The latter quantifies how the CATE, as an estimand, can be viewed as a regression/functional hybrid. We conclude with a discussion of a few remaining open problems.
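
To make the phrase "local polynomial R-Learner" concrete, here is a minimal sketch of a plain kernel-weighted local linear R-learner at a single point, with one covariate. It is an illustration under stated assumptions, not the estimator proposed in the paper: it omits the higher-order influence function corrections and the covariate-distribution estimation that the abstract mentions, and it plugs in the true nuisance functions rather than estimating them. The function names, the box kernel, and the simulated data-generating process are all hypothetical choices made for this example.

```python
import random


def local_linear_r_learner(x, a, y, m_hat, pi_hat, x0, h):
    """Local linear R-learner estimate of the CATE at x0 (one covariate).

    Minimizes over (b0, b1) the kernel-weighted residual-on-residual loss
        sum_i K_i * [(y_i - m_i) - (a_i - pi_i) * (b0 + b1 * (x_i - x0))]^2
    with a box kernel K_i = 1{|x_i - x0| <= h}, and returns b0.
    m_hat and pi_hat are per-observation values of E[Y | X] and P(A=1 | X);
    here they are assumed given (an illustrative simplification).
    """
    # Accumulate the 2x2 normal equations S @ (b0, b1) = t.
    s00 = s01 = s11 = t0 = t1 = 0.0
    for xi, ai, yi, mi, pi_i in zip(x, a, y, m_hat, pi_hat):
        if abs(xi - x0) > h:
            continue  # outside the kernel window
        ry = yi - mi           # outcome residual
        ra = ai - pi_i         # treatment residual
        z0 = ra                # regressor multiplying the intercept b0
        z1 = ra * (xi - x0)    # regressor multiplying the slope b1
        s00 += z0 * z0
        s01 += z0 * z1
        s11 += z1 * z1
        t0 += z0 * ry
        t1 += z1 * ry
    det = s00 * s11 - s01 * s01
    if det <= 1e-12:
        raise ValueError("too few points in the window; increase h")
    # Cramer's rule for the intercept, which estimates the CATE at x0.
    return (s11 * t0 - s01 * t1) / det


def simulate(n, seed=0, noise_sd=0.5):
    """Hypothetical data: X ~ U(0,1), true CATE tau(x) = 1 + x."""
    rng = random.Random(seed)
    x, a, y, m, p = [], [], [], [], []
    for _ in range(n):
        xi = rng.random()
        pi_i = 0.3 + 0.4 * xi                 # propensity score
        ai = 1 if rng.random() < pi_i else 0  # treatment indicator
        tau = 1.0 + xi                        # true CATE
        mu0 = 2.0 * xi                        # control regression
        yi = mu0 + ai * tau + rng.gauss(0.0, noise_sd)
        x.append(xi)
        a.append(ai)
        y.append(yi)
        m.append(mu0 + pi_i * tau)            # marginal regression E[Y | X]
        p.append(pi_i)
    return x, a, y, m, p


if __name__ == "__main__":
    x, a, y, m, p = simulate(4000, seed=1)
    est = local_linear_r_learner(x, a, y, m, p, x0=0.5, h=0.2)
    print("estimate at x0=0.5:", est, "(true value in this simulation: 1.5)")
```

The sketch regresses outcome residuals on treatment residuals within a kernel window, which is the basic R-learner idea; in practice the nuisance functions would be estimated (for example with sample splitting), and the quality of those estimates is exactly what the minimax analysis is concerned with.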
