Accurate estimation of conditional average treatment effects (CATE) is at the core of personalized decision-making. Although many models exist for CATE estimation, model selection is a nontrivial task because of the fundamental problem of causal inference: the individual treatment effect is never observed, so candidate models cannot be scored against ground truth. Recent empirical work provides evidence in favor of proxy loss metrics with doubly robust properties and in favor of model ensembling. However, theoretical understanding is lacking. Direct application of prior theoretical work leads to suboptimal oracle model selection rates due to the non-convexity of the model selection problem. We provide regret rates for the major existing CATE ensembling approaches and propose a new CATE model ensembling approach based on Q-aggregation using the doubly robust loss. Our main result shows that causal Q-aggregation achieves the statistically optimal oracle model selection regret rate of $\frac{\log(M)}{n}$ (with $M$ models and $n$ samples), plus higher-order estimation error terms that depend on products of errors in the nuisance functions. Crucially, our regret rate does not require that any of the candidate CATE models be close to the truth. We validate our new method on many semi-synthetic datasets and also provide extensions of our work to CATE model selection with instrumental variables and unobserved confounding.
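As a rough illustration of the approach summarized above, the following Python sketch builds doubly robust (AIPW-style) pseudo-outcomes and then chooses ensemble weights over candidate CATE models by minimizing a Q-aggregation-style objective on the simplex. It is a simplified sketch under stated assumptions, not the exact procedure of the paper. The function names `dr_pseudo_outcome`, `dr_loss`, and `q_aggregate` are hypothetical, and the mixing parameter `nu = 0.5`, the uniform prior over models (which makes the prior penalty constant on the simplex), and the Frank-Wolfe solver are all illustrative choices. The toy data uses known nuisance functions, whereas in practice the propensity and outcome regressions would be estimated, typically on a separate sample or with cross-fitting.

```python
import random


def dr_pseudo_outcome(y, t, e, mu0, mu1):
    """AIPW pseudo-outcome for one unit.

    y: observed outcome, t: binary treatment, e: propensity P(T=1 | X),
    mu0 / mu1: outcome regressions E[Y | X, T=0] and E[Y | X, T=1].
    """
    mu_t = mu1 if t == 1 else mu0
    return mu1 - mu0 + (t - e) / (e * (1.0 - e)) * (y - mu_t)


def dr_loss(cate_preds, pseudo):
    """Mean squared error of CATE predictions against the pseudo-outcomes."""
    return sum((p - c) ** 2 for p, c in zip(pseudo, cate_preds)) / len(pseudo)


def q_aggregate(preds, pseudo, nu=0.5, steps=200):
    """Simplex weights minimizing (assumed, simplified objective)
        (1 - nu) * L(sum_j theta_j f_j) + nu * sum_j theta_j L(f_j),
    where L is the doubly robust squared loss. Solved with Frank-Wolfe.

    preds: list of M lists, each holding one model's n CATE predictions.
    """
    m, n = len(preds), len(pseudo)
    losses = [dr_loss(p, pseudo) for p in preds]
    theta = [1.0 / m] * m
    for k in range(steps):
        # Current ensemble prediction for every sample.
        ens = [sum(theta[j] * preds[j][i] for j in range(m)) for i in range(n)]
        grad = []
        for j in range(m):
            g_ens = -2.0 / n * sum(
                (pseudo[i] - ens[i]) * preds[j][i] for i in range(n)
            )
            grad.append((1.0 - nu) * g_ens + nu * losses[j])
        # Frank-Wolfe step: move toward the vertex with the smallest gradient.
        best = min(range(m), key=lambda j: grad[j])
        step = 2.0 / (k + 2.0)
        theta = [(1.0 - step) * w for w in theta]
        theta[best] += step
    return theta


# Toy data with known nuisances (an assumption made only for this demo).
random.seed(0)
xs = [random.random() for _ in range(200)]
pseudo = []
for x in xs:
    t = 1 if random.random() < 0.5 else 0
    mu0, mu1 = x, x + (1.0 + x)  # true CATE is 1 + x
    y = (mu1 if t else mu0) + random.gauss(0.0, 0.1)
    pseudo.append(dr_pseudo_outcome(y, t, 0.5, mu0, mu1))

# Three hypothetical candidate CATE models: the true one and two poor ones.
candidates = [[1.0 + x for x in xs], [0.0 for _ in xs], [-1.0 for _ in xs]]
weights = q_aggregate(candidates, pseudo)
print(weights)
```

The second term of the objective penalizes weight placed on individually poor models, so the weights are pulled toward the vertices of the simplex. Setting `nu = 0` would reduce this sketch to ordinary convex aggregation of the candidates under the doubly robust loss.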