Matching is widely used to mimic a randomized experiment with observational data. Ideally, treated subjects are matched exactly with controls on the covariates, and randomization-based estimation can then be conducted as in a randomized experiment (assuming no unobserved covariates). However, when there are continuous covariates or many covariates, matching is typically inexact. Previous studies have routinely ignored inexact matching in downstream randomization-based estimation as long as some covariate balance criteria are satisfied, which can cause severe estimation bias. In this research note, building on the covariate-adaptive randomization inference framework, we propose two new classes of bias-corrected randomization-based estimators to reduce estimation bias due to inexact matching: the bias-corrected maximum $p$-value estimator for the constant treatment effect and the bias-corrected difference-in-means estimator for the average treatment effect. Our simulation results show that the proposed bias-corrected estimators can effectively reduce estimation bias due to inexact matching.
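To illustrate why ignoring inexact matching can bias the estimate, the following is a minimal sketch under a hypothetical toy model that is not taken from this note: outcomes are linear in a single covariate, the treatment effect is constant, and there is no noise. It compares the naive matched-pair difference-in-means with a standard regression adjustment on the within-pair covariate gaps. The adjustment is a stand-in used only for illustration and is not the bias-corrected estimators proposed here; all names and numbers (`TAU`, `BETA`, the covariate values) are assumptions.

```python
import numpy as np

# Hypothetical toy data (not from the paper): five matched pairs, one covariate.
TAU = 2.0   # assumed constant treatment effect
BETA = 3.0  # assumed covariate slope in the outcome model

x_treated = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
# Inexact matches: each control differs slightly from its treated partner.
x_control = np.array([0.9, 1.7, 2.8, 3.5, 4.6])

# Noise-free outcomes under the assumed linear model.
y_treated = TAU + BETA * x_treated
y_control = BETA * x_control


def naive_diff_in_means(y_t, y_c):
    """Paired difference-in-means that treats the matching as exact."""
    return float(np.mean(y_t - y_c))


def regression_adjusted(y_t, y_c, x_t, x_c):
    """Regress within-pair outcome gaps on within-pair covariate gaps.

    The fitted intercept is the effect estimate at a covariate gap of zero.
    """
    dy = y_t - y_c
    dx = x_t - x_c
    design = np.column_stack([np.ones_like(dx), dx])
    coef, *_ = np.linalg.lstsq(design, dy, rcond=None)
    return float(coef[0])


naive = naive_diff_in_means(y_treated, y_control)
adjusted = regression_adjusted(y_treated, y_control, x_treated, x_control)
print(f"naive estimate:    {naive:.3f}")
print(f"adjusted estimate: {adjusted:.3f}")
```

In this toy setup the naive estimate equals `TAU` plus `BETA` times the mean within-pair covariate gap (2.0 + 3.0 × 0.3 = 2.9), so the residual imbalance of 0.3 translates directly into bias, while the adjusted estimate recovers 2.0 up to floating-point error. With real data the outcome model is unknown and noisy, which is the setting the proposed randomization-based corrections are meant to address.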