Missing data is a common challenge in studying treatment effects. In the context of mediation analysis, this paper addresses missingness in the mediator and outcome, focusing on identification. We first consider self-separated missingness models where identification is achieved by conditional independence assumptions. This model class is somewhat limited as it is constrained by the need to remove a certain number of connections from the model. We then turn to self-connected missingness models where identification relies on information from shadow variables. This model class turns out to contain substantial variation, allowing models with built-in shadow variables (mediator, outcome or covariates) and models with auxiliary shadow variables at different positions in the causal structure. To improve the practical value of the missingness mechanisms, we allow where possible for dependencies due to unobserved causes of the missingness, a feature often neglected. In this exploration, we review existing models, connect to new models, and develop theory where needed. This results in templates for identification in the mediation setting, generally useful identification techniques, and perhaps most importantly a synthesis and substantial extension of shadow variable theory. Two examples relate the models to practical considerations.