Economists often estimate treatment effects in experiments using remotely sensed variables (RSVs), e.g., satellite images or mobile phone activity, in place of directly measured economic outcomes. A common practice is to use an observational sample to train a predictor of the economic outcome from the RSV, and then use these predictions as the outcomes in the experiment. We show that this method is biased whenever the RSV is a post-outcome variable, meaning that variation in the economic outcome causes variation in the RSV. For example, changes in poverty or environmental quality cause changes in satellite images, but not vice versa. As our main result, we nonparametrically identify the treatment effect by formalizing the intuition underlying common practice: the conditional distribution of the RSV given the outcome and treatment is stable across samples. Our identifying formula reveals that efficient inference requires predictions of three quantities from the RSV -- the outcome, treatment, and sample indicator -- whereas common practice only predicts the outcome. Valid inference does not require any rate conditions on RSV predictions, justifying the use of complex deep learning algorithms with unknown statistical properties. We reanalyze the effect of an anti-poverty program in India using satellite images.
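The bias of the common practice can be illustrated with a small simulation. The sketch below is not the paper's estimator or data. It is a toy model in which every number is an assumption made for illustration: a binary outcome, a scalar Gaussian RSV whose mean depends only on the outcome, a linear predictor fit by least squares, and a true effect of 0.2. The RSV is post-outcome by construction, and its distribution given the outcome is the same in both samples. The final rescaling step is a simple correction that works in this binary toy model under that stability assumption; it is shown only to make the source of the bias visible.

```python
import numpy as np

rng = np.random.default_rng(0)


def draw_rsv(y, rng):
    """Draw a scalar RSV caused by the outcome y (assumed: N(2*y, 1)).

    The same function is used in both samples, which encodes the
    stability assumption: the law of the RSV given the outcome
    (and treatment) does not change across samples.
    """
    return rng.normal(loc=2.0 * y, scale=1.0)


# Observational sample: outcome and RSV observed, no treatment.
n_obs = 200_000
y_obs = rng.binomial(1, 0.5, size=n_obs)
r_obs = draw_rsv(y_obs, rng)

# Train a predictor of the outcome from the RSV (assumed: linear, least squares).
slope, intercept = np.polyfit(r_obs, y_obs, 1)


def predict(r):
    return intercept + slope * r


# Experimental sample: treatment and RSV observed, outcome unobserved in practice.
n_exp = 200_000
true_effect = 0.2  # assumed for the simulation
d = rng.binomial(1, 0.5, size=n_exp)
y_exp = rng.binomial(1, 0.3 + true_effect * d)
r_exp = draw_rsv(y_exp, rng)

# Common practice: use predicted outcomes as if they were outcomes.
y_hat = predict(r_exp)
naive = y_hat[d == 1].mean() - y_hat[d == 0].mean()

# Toy correction: the naive contrast equals the true effect times
# E[prediction | Y=1] - E[prediction | Y=0], which is estimable in the
# observational sample because the RSV law given the outcome is stable.
scale = predict(r_obs[y_obs == 1]).mean() - predict(r_obs[y_obs == 0]).mean()
rescaled = naive / scale

print(f"true effect:       {true_effect:.3f}")
print(f"naive estimate:    {naive:.3f}")
print(f"rescaled estimate: {rescaled:.3f}")
```

In this design the naive contrast is attenuated toward zero because the predictor separates the two outcome groups by less than one unit, so differences in predicted outcomes understate differences in actual outcomes. The attenuation factor depends on the outcome distribution in the observational sample, which is why the bias does not vanish with more data or a better-tuned predictor.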