In studies of educational production functions or intergenerational mobility, it is common to transform the key variables into percentile ranks. Yet it remains unclear what a regression coefficient estimates when the outcome or the treatment enters as a rank. In this paper, we derive the effective causal estimands for a broad class of commonly used regression methods, including ordinary least squares (OLS), two-stage least squares (2SLS), difference-in-differences (DiD), and regression discontinuity designs (RDD). Specifically, we introduce a novel primitive causal estimand, the Rank Average Treatment Effect (rank-ATE), and prove that it serves as the building block of the effective estimands of all the aforementioned econometric methods. For 2SLS, DiD, and RDD, we show that direct application to outcome ranks identifies parameters that are difficult to interpret. To address this issue, we develop alternative methods that identify more interpretable causal parameters.