There is rising interest in using machine learning (ML) model predictions as outcomes in causal analysis. However, analyses based on predicted outcomes have struggled to recover the true treatment effect. Choosing among prediction models is also difficult, because what matters is not only predictive accuracy but also whether the predictions yield the correct causal effect in the downstream analysis. In this paper I propose a decomposition of the prediction into between-unit prediction ($η_μ$), within-unit-across-time prediction ($η_ε$), and counterfactual-treatment-effect prediction ($η_T$). I show that the counterfactual-treatment-effect component determines whether the model recovers the true treatment effect, but that only the first two components can be estimated from non-experimental data. I argue that within-unit-across-time prediction accuracy ($η_ε$) is a structurally better proxy for the counterfactual-treatment-effect component ($η_T$) than overall prediction accuracy, and I propose a metric that estimates it from panel data with at least two time periods. This metric serves as a diagnostic and model-selection tool for choosing ML models for causal analysis. Under the stronger assumption that $η_T \approx η_ε$, it also allows the construction of an approximately unbiased estimate of the treatment effect. I develop the theoretical framework and illustrate it with simulations on synthetic data.
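To make the decomposition concrete, the following is a minimal simulation sketch. It is an illustration under assumptions that the abstract does not state, not the estimator developed in the paper. It assumes a two-period panel with true outcome $Y_{it} = μ_i + ε_{it} + τ D_{it}$ and a prediction that scales each component linearly, $\hat{Y}_{it} = η_μ μ_i + η_ε ε_{it} + η_T τ D_{it} + \text{noise}$. The function names (`simulate_panel`, `slope`, `diagnostics`), the variances, and the use of first differences on untreated units as the within-unit metric are all hypothetical choices made for this example.

```python
import numpy as np


def simulate_panel(n_units=20000, eta_mu=0.9, eta_eps=0.4, eta_T=0.4,
                   tau=1.0, seed=0):
    """Simulate a two-period panel and a prediction with component-wise scaling.

    Assumed data-generating process (illustrative only):
      y     = mu_i + eps_it + tau * D_it
      y_hat = eta_mu * mu_i + eta_eps * eps_it + eta_T * tau * D_it + noise
    Treatment is randomly assigned and switched on in period 2 only.
    """
    rng = np.random.default_rng(seed)
    mu = rng.normal(0.0, 2.0, n_units)            # between-unit component
    eps = rng.normal(0.0, 1.0, (n_units, 2))      # within-unit-across-time component
    treated = rng.integers(0, 2, n_units).astype(bool)
    d = np.zeros((n_units, 2))
    d[treated, 1] = 1.0
    y = mu[:, None] + eps + tau * d
    noise = rng.normal(0.0, 0.1, (n_units, 2))
    y_hat = eta_mu * mu[:, None] + eta_eps * eps + eta_T * tau * d + noise
    return y, y_hat, treated


def slope(x, z):
    """OLS slope from regressing z on x (with an intercept)."""
    x = x - x.mean()
    z = z - z.mean()
    return float(x @ z / (x @ x))


def diagnostics(y, y_hat, treated):
    """Compare overall and within-unit prediction accuracy, then rescale the effect."""
    control = ~treated
    # Overall accuracy: pooled slope on untreated units, dominated by between-unit variation.
    overall = slope(y[control].ravel(), y_hat[control].ravel())
    # Within-unit accuracy: slope in first differences, which removes mu_i.
    dy = y[:, 1] - y[:, 0]
    dy_hat = y_hat[:, 1] - y_hat[:, 0]
    within = slope(dy[control], dy_hat[control])
    # Naive effect on predicted outcomes: difference in mean changes, roughly eta_T * tau.
    naive = float(dy_hat[treated].mean() - dy_hat[control].mean())
    # Rescaling is only justified under the assumption eta_T ~= eta_eps.
    corrected = naive / within
    return {"overall_slope": overall, "within_slope": within,
            "naive_effect": naive, "corrected_effect": corrected}


y, y_hat, treated = simulate_panel()
for name, value in diagnostics(y, y_hat, treated).items():
    print(f"{name}: {value:.3f}")
```

In this setup the pooled slope mostly reflects $η_μ$ because between-unit variance dominates, whereas the first-difference slope isolates $η_ε$. Dividing the naive effect by the within-unit slope approximately recovers $τ$ only because the simulation sets $η_T = η_ε$; if the two differ, the rescaled estimate is biased by the ratio $η_T / η_ε$.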