Covariate balancing estimation and model selection for difference-in-differences approach

Remarkable progress has been made in difference-in-differences (DID) approaches to causal inference that estimate the average treatment effect on the treated (ATT). Of these, the semiparametric DID (SDID) approach incorporates propensity score analysis into the DID setup. Supposing that the ATT is a function of covariates, we estimate it by inverse propensity score weighting. In this study, as one way to make the estimation robust to misspecification of the propensity score model, we incorporate covariate balancing. Then, by carefully constructing the moment conditions used in the covariate balancing, we show that the proposed estimator is doubly robust. In addition to estimation, we also address model selection. In practice, covariate selection is an essential task in statistical analysis, but even in the basic setting of the SDID approach, no reasonable information criterion is available. Here, we derive a model selection criterion as an asymptotically bias-corrected estimator of risk based on the loss function used in the SDID estimation. We show that the resulting penalty term differs considerably from roughly twice the number of parameters, the penalty that often appears in AIC-type information criteria. Numerical experiments show that the proposed method estimates the ATT more robustly than the method using propensity scores obtained by maximum likelihood estimation, and that the proposed criterion clearly reduces the risk targeted in the SDID approach compared with an intuitive generalization of the existing information criterion. In addition, real data analysis confirms that the results of the proposed method differ substantially from those of the existing method.
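
To make the weighting idea concrete, the following minimal sketch implements the standard inverse-propensity-weighted DID estimator of a constant ATT, with propensity scores fitted by maximum likelihood logistic regression. This is the kind of baseline the abstract compares against, not the proposed covariate balancing estimator, and it does not model the ATT as a function of covariates. The function names (`fit_logistic_pscore`, `ipw_did_att`, `simulate`) and the simulated data-generating process are assumptions introduced purely for illustration.

```python
import numpy as np


def fit_logistic_pscore(x, d, n_iter=25):
    """Fit P(D=1 | X) by maximum likelihood logistic regression (Newton's method)."""
    z = np.column_stack([np.ones(len(x)), x])  # add an intercept column
    beta = np.zeros(z.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-z @ beta))
        grad = z.T @ (d - p)                          # score of the log-likelihood
        hess = (z * (p * (1.0 - p))[:, None]).T @ z   # negative Hessian
        beta = beta + np.linalg.solve(hess, grad)
    return 1.0 / (1.0 + np.exp(-z @ beta))


def ipw_did_att(y_pre, y_post, d, pscore):
    """Weighted DID: mean of (dY / P(D=1)) * (D - pi(X)) / (1 - pi(X))."""
    dy = y_post - y_pre
    return np.mean(dy / d.mean() * (d - pscore) / (1.0 - pscore))


def simulate(n=20000, att=2.0, seed=0):
    """Hypothetical data: the outcome trend depends on x, so parallel trends
    holds only conditionally on the covariates."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(n, 2))
    true_ps = 1.0 / (1.0 + np.exp(-(0.5 * x[:, 0] - 0.5 * x[:, 1])))
    d = (rng.uniform(size=n) < true_ps).astype(float)
    y_pre = x[:, 0] + rng.normal(size=n)
    y_post = y_pre + 1.0 + x[:, 0] + att * d + rng.normal(size=n)
    return x, d, y_pre, y_post


if __name__ == "__main__":
    x, d, y_pre, y_post = simulate()
    pscore = fit_logistic_pscore(x, d)
    dy = y_post - y_pre
    print("unweighted DID:", dy[d == 1].mean() - dy[d == 0].mean())
    print("weighted DID:  ", ipw_did_att(y_pre, y_post, d, pscore))
```

In this simulated setting the unweighted DID is biased because treated units have systematically different covariate-driven trends, while the weighted version recovers the ATT when the logistic model is correctly specified. The weighted estimator inherits any misspecification of that model, which is the sensitivity the covariate balancing approach is meant to reduce.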
