Causal inference in observational studies can be challenging when confounders are subject to missingness. In general, the identification of causal effects is not guaranteed, even under restrictive parametric model assumptions, when confounders are missing not at random. To address this, we propose a general framework for establishing the identification of causal effects when confounders are subject to treatment-independent missingness, meaning that the missing data mechanism is independent of the treatment given the outcome and the possibly missing confounders. We give special consideration to commonly used models for continuous and binary outcomes and provide counterexamples in which identification fails. For estimation, we provide a weighted estimating equation method for the model parameters and propose three estimators of the average causal effect based on the estimated models. We evaluate the finite-sample performance of the estimators via simulations. We further illustrate the proposed method with real data sets from the National Health and Nutrition Examination Survey.