Causal Inference with MNAR Self-Masking Confounders: A Stratified Delta-Imputed Propensity Estimation Method

In observational studies, causal inference becomes difficult when confounders are missing not at random (MNAR), particularly when the missingness depends on the confounder's own unreported value (self-masking). Existing methods for handling MNAR confounders often rely on strong, unverifiable assumptions, leading to biased estimates. We propose a simple approach, the Stratified Delta-Imputed Propensity Estimator (SDIPE), for settings with self-masking confounders. SDIPE first stratifies the data into observed and missing groups and imputes the missing confounders via delta-adjusted multiple imputation. Then, within each group, the average treatment effect (ATE) is estimated using stabilized inverse probability weights. The final ATE is obtained by combining the group-specific estimates, each weighted by its group's proportion in the sample. A simulation study shows that SDIPE achieves low bias and near-nominal coverage (94-96%) across varying missingness rates, sample sizes, and treatment prevalence. In contrast, conventional sensitivity-based multiple imputation exhibits substantial bias and poor coverage (18-89%). Additionally, SDIPE is robust to the choice of the delta parameter. Applied to NHANES 2017-2018 data, SDIPE estimates that married individuals have a depression score 1.19 points lower than that of unmarried individuals (95% CI: -1.76, -0.64), adjusting for MNAR income data. SDIPE provides a practical and robust approach to causal inference with self-masking MNAR confounders, offering improved performance over existing methods without requiring restrictive assumptions about the missingness mechanism.
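
The steps summarized above (stratify, delta-impute, weight within each group, combine by group proportion) can be illustrated with a minimal sketch. It is not the authors' implementation. Everything here is an assumption made for illustration: the function names `propensity`, `sipw_ate`, and `sdipe`, the single continuous confounder, the linear imputation model (confounder regressed on treatment and outcome in the observed group, then shifted by `delta`), the logistic propensity model, and the simulated data. Imputation-specific estimates are simply averaged, and variance estimation is omitted.

```python
import numpy as np


def propensity(x, a, iters=25):
    """P(A=1 | x) from a logistic regression fitted by Newton-Raphson."""
    X = np.column_stack([np.ones(len(x)), x])
    beta = np.zeros(2)
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (a - p)
        hess = (X * (p * (1.0 - p))[:, None]).T @ X
        beta = beta + np.linalg.solve(hess, grad)
    return 1.0 / (1.0 + np.exp(-X @ beta))


def sipw_ate(x, a, y):
    """ATE as a difference of weighted outcome means under stabilized IP weights."""
    e = propensity(x, a)
    p_treat = a.mean()  # marginal treatment probability (stabilizing numerator)
    w = np.where(a == 1, p_treat / e, (1.0 - p_treat) / (1.0 - e))
    t = a == 1
    return np.average(y[t], weights=w[t]) - np.average(y[~t], weights=w[~t])


def sdipe(x, r, a, y, delta, n_imp=5, seed=0):
    """Hypothetical SDIPE sketch: r == 1 marks rows where x is observed."""
    rng = np.random.default_rng(seed)
    obs = r == 1
    mis = ~obs

    # Group 1: confounder observed, so weight directly on the reported values.
    ate_obs = sipw_ate(x[obs], a[obs], y[obs])

    # Assumed imputation model, fitted on the observed group: x ~ 1 + a + y.
    d_obs = np.column_stack([np.ones(obs.sum()), a[obs], y[obs]])
    coef, *_ = np.linalg.lstsq(d_obs, x[obs], rcond=None)
    resid_sd = np.std(x[obs] - d_obs @ coef, ddof=3)

    # Group 2: confounder missing, so draw imputations and shift them by delta.
    d_mis = np.column_stack([np.ones(mis.sum()), a[mis], y[mis]])
    ates_mis = []
    for _ in range(n_imp):
        x_imp = d_mis @ coef + rng.normal(0.0, resid_sd, mis.sum()) + delta
        ates_mis.append(sipw_ate(x_imp, a[mis], y[mis]))
    ate_mis = np.mean(ates_mis)

    # Combine the group-specific ATEs, weighted by group proportions.
    p_obs = obs.mean()
    return float(p_obs * ate_obs + (1.0 - p_obs) * ate_mis)


# Simulated example (all values are assumptions): the true ATE is 1.0, and
# larger confounder values are more likely to be unreported (self-masking).
rng = np.random.default_rng(1)
n = 4000
x_true = rng.normal(size=n)
a = rng.binomial(1, 1.0 / (1.0 + np.exp(-0.8 * x_true))).astype(float)
y = 1.0 * a + 1.5 * x_true + rng.normal(size=n)
r = rng.binomial(1, 1.0 / (1.0 + np.exp(-(1.0 - x_true))))
x_reported = np.where(r == 1, x_true, np.nan)

est = sdipe(x_reported, r, a, y, delta=0.5)
print(f"SDIPE estimate: {est:.3f}")
```

In this sketch `delta` is a user-chosen sensitivity parameter. Re-running `sdipe` over a grid of `delta` values is one simple way to check how much the combined estimate moves.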
