Estimating the conditional average treatment effect (CATE) is a central problem in causal inference, with applications across many fields. Estimating the CATE typically requires the unconfoundedness assumption to ensure that the underlying regression problems are identifiable. For high-dimensional data, many variable selection methods and representation-learning neural network approaches have been proposed. However, these methods provide no way to verify whether the subset of variables retained after dimensionality reduction, or the learned representation, still satisfies the unconfoundedness assumption, which can lead to invalid treatment effect estimates. Additionally, when estimating the regression function for each group, these methods typically use data from only that group (treatment or control). This paper proposes a novel neural network approach named \textbf{CrossNet}, which learns a sufficient representation of the features and then estimates the CATE based on it. Here "cross" indicates that each regression function is estimated using data from its own group as well as data from the other group. Numerical simulations and empirical results show that our method outperforms competing approaches.