Despite the growing interest in causal and statistical inference for settings with data dependence, few methods currently exist to account for missing data in dependent-data settings; most classical missing data methods in statistics and causal inference treat data units as independent and identically distributed (i.i.d.). We develop a graphical-modeling-based framework for causal inference in the presence of entangled missingness, defined as missingness with data dependence. We distinguish three different types of entanglements that can occur, supported by real-world examples. We give sound and complete identification results for all three settings. We show that existing missing data models may be extended to cover entanglements arising from (1) target law dependence and (2) missingness process dependence, while those arising from (3) missingness interference require a novel approach. We demonstrate the use of our entangled missingness framework on synthetic data. Finally, we discuss how, subject to a certain reinterpretation of the variables in the model, our model for missingness interference extends missing data methods to novel missing data patterns in i.i.d. settings.