In epidemiology and the social sciences, propensity score methods are popular for estimating treatment effects from observational data, and multiple imputation is popular for handling covariate missingness. However, how to combine multiple imputation with propensity score analysis appropriately is not completely clear. This paper aims to clarify the consistency (or lack thereof) of the methods that have been proposed, focusing on the within approach (where the effect is estimated separately in each imputed dataset and the multiple estimates are then combined) and the across approach (where propensity scores are typically averaged across imputed datasets before being used for effect estimation). We show that the within method is valid and can be used with any causal effect estimator that is consistent in the full-data setting. Existing across methods are inconsistent, but a different across method, which averages the inverse probability weights across imputed datasets, is consistent for propensity score weighting. We also comment on methods that rely on imputing a function of the missing covariate rather than the covariate itself, including imputation of the propensity score and of the probability weight. Based on the consistency results and on practical flexibility, we recommend generally using the standard within method. Throughout, we provide intuition to make the results meaningful to the broad audience of applied researchers.
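To make the distinction between the approaches concrete, the following is a minimal sketch rather than the paper's implementation. It assumes that the treatment `t` and outcome `y` are fully observed, that a propensity score has already been estimated in each imputed dataset (`ps_by_imputation`, one list per imputation), and that the effect estimator is a normalized inverse-probability-weighted difference in means. All function names are hypothetical, and variance estimation (for example, via Rubin's rules) is omitted.

```python
def ip_weights(t, ps):
    """ATE inverse probability weights: 1/e if treated, 1/(1-e) if control."""
    return [1.0 / e if ti == 1 else 1.0 / (1.0 - e) for ti, e in zip(t, ps)]


def weighted_diff(y, t, w):
    """Normalized weighted difference in mean outcomes (treated minus control)."""
    num1 = sum(wi * yi for yi, ti, wi in zip(y, t, w) if ti == 1)
    den1 = sum(wi for ti, wi in zip(t, w) if ti == 1)
    num0 = sum(wi * yi for yi, ti, wi in zip(y, t, w) if ti == 0)
    den0 = sum(wi for ti, wi in zip(t, w) if ti == 0)
    return num1 / den1 - num0 / den0


def within_estimate(y, t, ps_by_imputation):
    """Within approach: estimate the effect in each imputed dataset, then average."""
    estimates = [weighted_diff(y, t, ip_weights(t, ps)) for ps in ps_by_imputation]
    return sum(estimates) / len(estimates)


def across_ps_estimate(y, t, ps_by_imputation):
    """Across approach (typical form): average propensity scores, then weight once."""
    m = len(ps_by_imputation)
    mean_ps = [sum(unit_ps) / m for unit_ps in zip(*ps_by_imputation)]
    return weighted_diff(y, t, ip_weights(t, mean_ps))


def across_weight_estimate(y, t, ps_by_imputation):
    """Across approach (alternative): average the weights themselves, then estimate once."""
    m = len(ps_by_imputation)
    weights = [ip_weights(t, ps) for ps in ps_by_imputation]
    mean_w = [sum(unit_w) / m for unit_w in zip(*weights)]
    return weighted_diff(y, t, mean_w)


# Toy illustration with two imputations that disagree only for the first unit.
y = [3.0, 1.0, 4.0, 2.0]
t = [1, 0, 1, 0]
ps_by_imputation = [[0.5, 0.5, 0.5, 0.5], [0.25, 0.5, 0.5, 0.5]]
print(within_estimate(y, t, ps_by_imputation))
print(across_ps_estimate(y, t, ps_by_imputation))
print(across_weight_estimate(y, t, ps_by_imputation))
```

The three functions differ only in what gets averaged over imputations: the effect estimates, the propensity scores, or the weights. Because the weight is a nonlinear function of the propensity score, averaging scores and averaging weights generally give different answers, which is why the toy data produce three distinct numbers.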