In both experimental and observational studies, researchers often have limited knowledge of why outcomes are missing. To address this uncertainty, we propose bounds on causal effects when outcomes are missing, accommodating the scenario in which missingness is an unobserved mixture of informative and non-informative components. Within this mixed missingness framework, we explore several assumptions under which bounds on causal effects can be derived, including bounds expressed as a function of user-specified sensitivity parameters. We develop influence-function-based estimators of these bounds to enable flexible, non-parametric, and machine-learning-based estimation, achieving root-n convergence rates and asymptotic normality under relatively mild conditions. We further consider the identification and estimation of bounds for other causal quantities that remain meaningful when informative missingness reflects a competing outcome, such as death. We conduct simulation studies and illustrate our methodology with a study of the causal effect of antipsychotic drugs on diabetes risk using a health insurance dataset.