The growing demand for personalized decision-making has led to a surge of interest in estimating the Conditional Average Treatment Effect (CATE). Various types of CATE estimators have been developed with advancements in machine learning and causal inference. However, selecting a desirable CATE estimator through a conventional model validation procedure remains impractical due to the absence of counterfactual outcomes in observational data. Existing approaches for CATE estimator selection, such as plug-in and pseudo-outcome metrics, face two challenges. First, they must specify both the metric form and the underlying machine learning models used to fit nuisance parameters (e.g., the outcome function, the propensity function, and the plug-in learner). Second, they are not specifically designed to select a robust CATE estimator. To address these challenges, this paper introduces a Distributionally Robust Metric (DRM) for CATE estimator selection. The proposed DRM is nuisance-free, eliminating the need to fit models for nuisance parameters, and it effectively prioritizes the selection of a distributionally robust CATE estimator. Experimental results validate the effectiveness of the DRM method in selecting CATE estimators that are robust to the distribution shift incurred by covariate shift and hidden confounders.
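For context, the pseudo-outcome metrics that the abstract contrasts with DRM can be sketched as follows. This is a minimal illustration, not the paper's DRM method: it uses synthetic data, assumes scikit-learn models for the nuisance parameters (outcome and propensity functions), and scores each candidate CATE estimate by its mean squared error against a doubly robust (AIPW-style) pseudo-outcome. Note how the metric itself depends on fitted nuisance models, which is exactly the first challenge the abstract identifies.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

# Synthetic observational data (all names here are illustrative)
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 2))
e_true = 1.0 / (1.0 + np.exp(-X[:, 0]))       # true propensity
T = rng.binomial(1, e_true)                    # observed treatment
tau_true = 1.0 + X[:, 1]                       # true CATE
Y = X[:, 0] + T * tau_true + rng.normal(size=n)

# Fit nuisance parameters: per-arm outcome models and a propensity model
mu0 = LinearRegression().fit(X[T == 0], Y[T == 0])
mu1 = LinearRegression().fit(X[T == 1], Y[T == 1])
prop = LogisticRegression().fit(X, T)

m0, m1 = mu0.predict(X), mu1.predict(X)
e_hat = prop.predict_proba(X)[:, 1]

# Doubly robust pseudo-outcome: an unbiased (under assumptions)
# proxy for the unobservable individual treatment effect
pseudo = (m1 - m0
          + T * (Y - m1) / e_hat
          - (1 - T) * (Y - m0) / (1 - e_hat))

def pseudo_outcome_score(tau_hat):
    """MSE of a candidate CATE estimate against the pseudo-outcome."""
    return float(np.mean((tau_hat - pseudo) ** 2))

# Compare two candidate CATE estimates: a constant (ATE-only) guess
# versus the true heterogeneous effect; the truth should score lower.
score_const = pseudo_outcome_score(np.full(n, 1.0))
score_true = pseudo_outcome_score(tau_true)
```

A selection procedure would compute such a score for every candidate CATE estimator on held-out data and pick the minimizer; the DRM proposed in the paper replaces this nuisance-dependent score with a nuisance-free, distributionally robust one.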