State-of-the-art methods for conditional average treatment effect (CATE) estimation make widespread use of representation learning. The idea is to reduce the variance of low-sample CATE estimation by means of a (potentially constrained) low-dimensional representation. However, low-dimensional representations can lose information about the observed confounders and thus introduce bias, so the validity of representation learning for CATE estimation is typically violated. In this paper, we propose a new, representation-agnostic refutation framework for estimating bounds on the representation-induced confounding bias that arises from dimensionality reduction (or other constraints on the representations) in CATE estimation. First, we establish theoretically the conditions under which CATE is non-identifiable given low-dimensional (constrained) representations. Second, as a remedy, we propose a neural refutation framework that performs partial identification of CATE or, equivalently, estimates lower and upper bounds on the representation-induced confounding bias. We demonstrate the effectiveness of our bounds in a series of experiments. In sum, our refutation framework is directly relevant in practice wherever the validity of CATE estimation is important.
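
To make the notion of representation-induced confounding bias concrete, the following is a minimal toy simulation and not the neural refutation framework proposed in the paper. All names (`simulate`, `diff_in_means`) and the data-generating process are assumptions made purely for illustration: two binary covariates, a true CATE of 1 everywhere, and a representation that keeps only the first covariate and so drops the confounder.

```python
import random


def simulate(n=200_000, seed=0):
    """Toy data (hypothetical): x2 confounds treatment a and outcome y; x1 is noise."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x1 = int(rng.random() < 0.5)
        x2 = int(rng.random() < 0.5)
        # Treatment is more likely when x2 = 1, so x2 is a confounder.
        a = int(rng.random() < (0.8 if x2 else 0.2))
        # True treatment effect is 1.0 for every unit.
        y = 1.0 * a + 2.0 * x2 + rng.gauss(0.0, 1.0)
        rows.append((x1, x2, a, y))
    return rows


def diff_in_means(rows, representation):
    """Estimate E[Y | A=1, phi] - E[Y | A=0, phi] within each value of phi."""
    stats = {}
    for x1, x2, a, y in rows:
        s = stats.setdefault(representation(x1, x2), [0.0, 0, 0.0, 0])
        if a:
            s[0] += y
            s[1] += 1
        else:
            s[2] += y
            s[3] += 1
    return {k: s[0] / s[1] - s[2] / s[3] for k, s in stats.items()}


rows = simulate()
# Representation that keeps both covariates: adjustment is valid.
full = diff_in_means(rows, lambda x1, x2: (x1, x2))
# Low-dimensional representation that drops the confounder x2.
reduced = diff_in_means(rows, lambda x1, x2: x1)

print("full covariates:", full)
print("reduced representation:", reduced)
```

Under this assumed data-generating process, conditioning on both covariates recovers an effect close to 1 in every stratum, whereas conditioning only on the reduced representation converges to 2.2 (since E[Y | A=1] = 1 + 2 · 0.8 and E[Y | A=0] = 2 · 0.2), a bias of 1.2 that no amount of additional data removes.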