Recoverability of Causal Effects in a Longitudinal Study under Presence of Missing Data

Missing data in multiple variables is a common issue. We investigate the applicability of the framework of graphical models for handling missing data to a complex longitudinal pharmacological study of children with HIV treated with an efavirenz-based regimen as part of the CHAPAS-3 trial. Specifically, we examine whether the causal effects of interest, defined through static interventions on multiple continuous variables, can be recovered (estimated consistently) from the available data alone. So far, no general algorithms are available to decide on recoverability, and decisions have to be made on a case-by-case basis. We emphasize the sensitivity of recoverability to even the smallest changes in the graph structure, and present recoverability results for three plausible missingness directed acyclic graphs (m-DAGs) in the CHAPAS-3 study, informed by clinical knowledge. Furthermore, we propose the concept of "closed missingness mechanisms" and show that under these mechanisms an available case analysis is admissible for consistent estimation of any type of statistical and causal query, even if the underlying missingness mechanism is of the missing not at random (MNAR) type. Both simulations and theoretical considerations demonstrate how, in the assumed MNAR setting of our study, a complete or available case analysis can be superior to multiple imputation, and how estimation results vary depending on the assumed m-DAG. Our analyses are possibly the first to show the applicability of m-DAGs to complex longitudinal real-world data, while highlighting the sensitivity of the results to the assumed causal model.
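To make the notion of recoverability concrete, the following toy simulation may help. It is not the CHAPAS-3 analysis and does not implement the closed missingness mechanisms proposed here; it is a minimal sketch under assumed, hypothetical settings (a single covariate `x`, a linear outcome `y`, and a logistic missingness model in which the chance of observing `x` depends on `x` itself, i.e. an MNAR mechanism). Under these assumptions, the complete case regression slope of `y` on `x` stays close to its true value, whereas the complete case mean of `x` is clearly biased, which illustrates that recoverability depends on both the query and the missingness structure.

```python
import numpy as np


def simulate(n=200_000, seed=0):
    """Toy MNAR example: missingness of x depends on x itself.

    All parameter values are illustrative assumptions, not taken from the study.
    """
    rng = np.random.default_rng(seed)

    # Full data: y depends linearly on x with true slope 2 and intercept 1.
    x = rng.normal(size=n)
    y = 1.0 + 2.0 * x + rng.normal(size=n)

    # MNAR mechanism: larger x values are less likely to be observed.
    p_observed = 1.0 / (1.0 + np.exp(-(0.5 - 1.5 * x)))
    observed = rng.random(n) < p_observed

    # Complete case analysis: keep only rows where x is observed.
    # np.polyfit returns coefficients from highest degree to lowest.
    cc_slope, cc_intercept = np.polyfit(x[observed], y[observed], 1)

    return {
        "true_slope": 2.0,
        "cc_slope": float(cc_slope),
        "true_mean_x": 0.0,
        "cc_mean_x": float(x[observed].mean()),
        "fraction_observed": float(observed.mean()),
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.3f}")
```

In this sketch the selection depends only on the covariate, so the conditional distribution of `y` given `x` is unchanged among the complete cases, while the marginal distribution of `x` is distorted. Whether a comparable argument holds in a given application depends on the assumed m-DAG.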
