Matching is widely used to mimic a randomized experiment with observational data. Ideally, treated subjects are exactly matched with controls on the covariates, and randomization-based estimation can then be conducted as in a randomized experiment (assuming no unobserved covariates). However, when there are continuous covariates or many covariates, matching is typically inexact. Previous studies have routinely ignored inexact matching in the downstream randomization-based estimation as long as some covariate balance criteria are satisfied, which can cause severe estimation bias. Building on the covariate-adaptive randomization inference framework, in this research note we propose two new classes of bias-corrected randomization-based estimators to reduce estimation bias due to inexact matching: the bias-corrected maximum $p$-value estimator for the constant treatment effect and the bias-corrected difference-in-means estimator for the average treatment effect. Our simulation results show that the proposed bias-corrected estimators can effectively reduce estimation bias due to inexact matching.
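To illustrate why ignoring inexact matching can bias estimation, the following minimal sketch simulates matched pairs in which the treated unit's covariate is systematically slightly larger than its matched control's. It compares the naive difference-in-means with a simple regression-adjusted version. This is not the estimator proposed in this note: the correction shown is a generic regression adjustment standing in for it, and the data-generating model, the function names, and all parameter values are assumptions made purely for illustration.

```python
import random


def simulate_pairs(n_pairs, tau, beta, seed=0):
    """Simulate matched pairs with a residual covariate gap (hypothetical model).

    Outcome model (an assumption): Y = tau * Z + beta * x + noise.
    Each pair is returned as (x_treated, y_treated, x_control, y_control).
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_pairs):
        x_c = rng.gauss(0.0, 1.0)
        # Inexact match: the treated covariate is systematically larger.
        x_t = x_c + abs(rng.gauss(0.3, 0.1))
        y_c = beta * x_c + rng.gauss(0.0, 1.0)
        y_t = tau + beta * x_t + rng.gauss(0.0, 1.0)
        pairs.append((x_t, y_t, x_c, y_c))
    return pairs


def ols_slope(xs, ys):
    """Slope of a simple least-squares regression of ys on xs."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def naive_diff_in_means(pairs):
    """Average treated-minus-control difference, ignoring covariate gaps."""
    return sum(y_t - y_c for _, y_t, _, y_c in pairs) / len(pairs)


def bias_corrected_diff_in_means(pairs):
    """Subtract the outcome difference predicted by the within-pair covariate gap.

    The slope is fitted on controls only, so it does not absorb the treatment effect.
    """
    slope = ols_slope([x_c for _, _, x_c, _ in pairs],
                      [y_c for _, _, _, y_c in pairs])
    adjusted = [(y_t - y_c) - slope * (x_t - x_c) for x_t, y_t, x_c, y_c in pairs]
    return sum(adjusted) / len(adjusted)


if __name__ == "__main__":
    true_tau = 1.0
    pairs = simulate_pairs(n_pairs=2000, tau=true_tau, beta=2.0, seed=0)
    print("true effect:   ", true_tau)
    print("naive estimate:", round(naive_diff_in_means(pairs), 3))
    print("bias-corrected:", round(bias_corrected_diff_in_means(pairs), 3))
```

Under this assumed model, the naive estimator absorbs roughly `beta` times the average within-pair covariate gap as bias, whereas the adjusted estimator removes most of it. The pairs here could still pass a loose covariate balance check, which is the situation the abstract warns about.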