This paper studies the identification, estimation, and inference of long-term (binary) treatment effect parameters when balanced panel data are not available or make up only a subset of the available data. We develop a new estimator, the chained difference-in-differences, which leverages the overlapping structure of many unbalanced panel data sets. The approach aggregates a collection of short-term treatment effects estimated on multiple incomplete panels. Our estimator accommodates (1) multiple time periods, (2) variation in treatment timing, (3) treatment effect heterogeneity, (4) general missing data patterns, and (5) sample selection on observables. We establish the asymptotic properties of the proposed estimator and discuss its identification and efficiency gains relative to existing methods. Finally, we illustrate its relevance through (i) numerical simulations and (ii) an application to the effects of an innovation policy in France.
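To make the chaining idea concrete, the following is a minimal sketch and not the estimator developed in the paper. It assumes a single treatment date, no covariates, and unweighted aggregation, so it ignores staggered timing, sample selection, and efficient weighting. The function name `chained_did` and the toy rotating panel are hypothetical. Each unit is observed in only two consecutive periods, so no unit links the first and last periods directly, yet summing the one-period difference-in-differences along the chain still yields a long-term effect.

```python
import numpy as np

def chained_did(y, treated, observed):
    """Cumulative treatment effects relative to period 0.

    y        : (n_units, n_periods) outcomes, NaN where unobserved
    treated  : (n_units,) boolean treatment-group indicator
    observed : (n_units, n_periods) boolean observation indicator
    """
    n_periods = y.shape[1]
    steps = []
    for t in range(1, n_periods):
        # Units observed in both t-1 and t form one short, incomplete panel.
        both = observed[:, t - 1] & observed[:, t]
        diff = y[:, t] - y[:, t - 1]
        # One-period DiD: treated change minus control change.
        step = diff[both & treated].mean() - diff[both & ~treated].mean()
        steps.append(step)
    # Chain the short-term effects into long-term effects.
    return np.cumsum(steps)

# Toy rotating panel (illustrative data, no noise).
n_units, n_periods = 12, 4
treated = np.arange(n_units) < 6
effect = np.array([0.0, 1.0, 2.0, 3.0])   # assumed true effect by period
alpha = np.arange(n_units, dtype=float)   # unit fixed effects
trend = 0.5 * np.arange(n_periods)        # common time trend

y = alpha[:, None] + trend[None, :] + treated[:, None] * effect[None, :]

# Unit i is observed only in periods i % 3 and i % 3 + 1.
start = np.arange(n_units) % 3
periods = np.arange(n_periods)[None, :]
observed = (periods == start[:, None]) | (periods == start[:, None] + 1)
y = np.where(observed, y, np.nan)

print(chained_did(y, treated, observed))
```

In this noise-free example the chained sums recover the assumed cumulative effects of 1, 2, and 3, even though a direct comparison between periods 0 and 3 is infeasible because no unit is observed in both.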