Missing data in multiple variables is a common issue. We investigate the applicability of the framework of graphical models for handling missing data to a complex longitudinal pharmacological study of HIV-positive children treated with an efavirenz-based regimen as part of the CHAPAS-3 trial. Specifically, we examine whether the causal effects of interest, defined through static interventions on multiple continuous variables, can be recovered (estimated consistently) from the available data alone. So far, no general algorithms are available to decide on recoverability, and decisions have to be made on a case-by-case basis. We emphasize the sensitivity of recoverability to even the smallest changes in the graph structure, and present recoverability results for three plausible missingness directed acyclic graphs (m-DAGs) in the CHAPAS-3 study, informed by clinical knowledge. Furthermore, we propose the concept of "closed missingness mechanisms" and show that under these mechanisms an available case analysis is admissible for consistent estimation of any type of statistical or causal query, even if the underlying missingness mechanism is missing not at random (MNAR). Both simulations and theoretical considerations demonstrate how, in the assumed MNAR setting of our study, a complete or available case analysis can be superior to multiple imputation, and that estimation results vary depending on the assumed m-DAG. Our analyses are possibly the first to show the applicability of m-DAGs to complex longitudinal real-world data, while highlighting the sensitivity with respect to the assumed causal model.
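The point that a complete case analysis can remain consistent under an MNAR mechanism, and that recoverability depends on the query, can be illustrated with a toy simulation. The sketch below is neither the CHAPAS-3 analysis nor the simulation reported in the study; the linear outcome model, the logistic missingness mechanism, the coefficient values, and the function name `simulate_complete_case` are all assumptions chosen for illustration. Missingness of X depends on X itself (hence MNAR) but not on Y given X, so the regression of Y on X is expected to be recoverable from complete cases, whereas the marginal mean of X is not.

```python
import numpy as np


def simulate_complete_case(n=200_000, seed=0):
    """Toy MNAR example (hypothetical setup, not the study's m-DAGs)."""
    rng = np.random.default_rng(seed)

    # Full data: X ~ N(0, 1), Y = 1 + 2 X + noise.
    x = rng.normal(size=n)
    y = 1.0 + 2.0 * x + rng.normal(size=n)

    # MNAR mechanism: the chance that X is observed depends on X itself
    # (larger X values are more likely to be missing), but not on Y given X.
    p_observed = 1.0 / (1.0 + np.exp(-(0.5 - 1.5 * x)))
    observed = rng.random(n) < p_observed

    # Complete case analysis: keep only rows where X is observed.
    x_cc, y_cc = x[observed], y[observed]

    # np.polyfit with degree 1 returns [slope, intercept].
    slope_cc, intercept_cc = np.polyfit(x_cc, y_cc, 1)

    return {
        "fraction_observed": observed.mean(),
        "mean_x_full": x.mean(),
        "mean_x_complete_case": x_cc.mean(),
        "slope_complete_case": slope_cc,
        "intercept_complete_case": intercept_cc,
    }


if __name__ == "__main__":
    for name, value in simulate_complete_case().items():
        print(f"{name}: {value:.3f}")
```

In this setup the complete case slope and intercept should land close to the true values 2 and 1, while the complete case mean of X is shifted away from 0. The same data therefore support one query and not another, which is the kind of query-specific and graph-specific reasoning that recoverability requires.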