This position paper argues that, in debiased machine learning (DML), balancing functions should be derived from the Neyman orthogonal score rather than chosen solely as functions of covariates. Covariate balancing is effective when the regression error entering the score can be represented by functions of the covariates alone, and it is the natural finite-dimensional approximation for targets such as the counterfactual means used for the average treatment effect on the treated (ATT). For estimation of the average treatment effect (ATE) under treatment effect heterogeneity, however, the score error generally contains treatment-specific components, because the outcome regression is a function of the full regressor $X=(D,Z)$, where $D$ is the treatment and $Z$ the covariates. In that case, balancing common functions of $Z$ can leave the treatment-specific component unbalanced. We therefore advocate regressor balancing, implemented by Riesz regression with basis functions of $X$, as the general balancing principle for DML. The position is not that covariate balancing is invalid, but that it should be understood as the special case that is appropriate when the score-relevant regression error is a function of covariates alone.
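To make the idea of regressor balancing concrete, the following is a minimal sketch, not the paper's implementation. It assumes a linear Riesz regression for the ATE functional $m(W; g) = g(1,Z) - g(0,Z)$ with a treatment-interacted basis $\phi(X) = (D\,b(Z),\ (1-D)\,b(Z))$. The function name `riesz_regression_ate`, the polynomial basis $b(Z) = (1, Z, Z^2)$, and the simulated data are all illustrative assumptions. The least-squares solution yields weights $\hat\alpha(X_i)$ that exactly satisfy the sample balancing condition $\frac{1}{n}\sum_i \hat\alpha(X_i)\phi(X_i) = \frac{1}{n}\sum_i \{\phi(1,Z_i) - \phi(0,Z_i)\}$, so the treatment-specific basis functions are balanced separately in each arm.

```python
import numpy as np


def riesz_regression_ate(D, Z_basis):
    """Linear Riesz regression for the ATE functional m(W; g) = g(1, Z) - g(0, Z).

    Hypothetical helper for illustration only. Uses the basis of the full
    regressor X = (D, Z): phi(X) = (D * b(Z), (1 - D) * b(Z)).
    """
    D = D.reshape(-1, 1)
    # Basis evaluated at the observed regressor X_i = (D_i, Z_i).
    phi = np.hstack([D * Z_basis, (1 - D) * Z_basis])
    # Functional applied to the basis: phi(1, Z_i) - phi(0, Z_i) = (b(Z_i), -b(Z_i)).
    m_phi = np.hstack([Z_basis, -Z_basis])
    n = phi.shape[0]
    G = phi.T @ phi / n          # sample Gram matrix E_n[phi phi']
    M = m_phi.mean(axis=0)       # sample moment E_n[m(W; phi)]
    # Minimizer of E_n[(phi'beta)^2 - 2 m(W; phi)'beta].
    beta = np.linalg.solve(G, M)
    alpha = phi @ beta           # estimated Riesz representer (balancing weights)
    return alpha, phi, M


# Simulated data (assumed design, for illustration only).
rng = np.random.default_rng(0)
n = 2000
Z = rng.normal(size=n)
propensity = 1.0 / (1.0 + np.exp(-0.5 * Z))
D = (rng.uniform(size=n) < propensity).astype(float)

# Assumed covariate basis b(Z) = (1, Z, Z^2).
b = np.column_stack([np.ones(n), Z, Z**2])

alpha, phi, M = riesz_regression_ate(D, b)

# Balancing check: weighted basis means should match the target moments.
balance_gap = phi.T @ alpha / n - M
print("max |balance gap|:", np.max(np.abs(balance_gap)))
```

Because the basis is interacted with $D$, the Gram matrix is block-diagonal and the weights balance $b(Z)$ in the treated and control arms separately. A basis consisting of common functions of $Z$ alone would give $\phi(1,Z) - \phi(0,Z) = 0$ in this formulation, so it carries no information about the treatment-specific component.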