Matching is a widely used causal inference design that aims to approximate a randomized experiment with observational data by forming matched sets of treated and control units that are similar in their covariates. Ideally, treated units are matched exactly with controls on these covariates, which permits randomization-based inference for treatment effects as in a randomized experiment, under the assumption of no unobserved covariates. In practice, matching is often inexact, leaving residual covariate imbalance after matching. Previous matched studies have typically overlooked this issue, relying on conventional randomization-based inference once some covariate balance criteria are met. Recent research has shown that this approach can introduce significant bias and has proposed methods to correct for the bias that inexact matching induces in randomization-based inference. These methods, however, focus primarily on constant treatment effects and their extensions (i.e., Fisher's sharp null) and do not apply to average treatment effects (i.e., Neyman's weak null). To address this gap, we introduce a new method, inverse post-matching probability weighting, for conducting randomization-based inference for average treatment effects under inexact matching. Our theoretical and simulation results indicate that, compared to conventional randomization-based inference methods, our approach significantly reduces bias and improves coverage rates in the presence of inexact matching.