Identification enhanced generalised linear model estimation with nonignorable missing outcomes

Missing data often result in undesirable bias and loss of efficiency. These issues become substantial when the response mechanism is nonignorable, meaning that the response model depends on unobserved variables. To manage nonignorable nonresponse, it is necessary to estimate the joint distribution of unobserved variables and response indicators. However, model misspecification and identification issues can prevent robust estimates, even with careful estimation of the target joint distribution. In this study, we modeled the distribution of the observed parts and derived sufficient conditions for model identifiability, assuming a logistic regression model as the response mechanism and generalized linear models as the main outcome model of interest. More importantly, the derived sufficient conditions do not require any instrumental variables, which are often assumed to guarantee model identifiability but cannot be practically determined beforehand. To analyze missing data in applications, we propose practical guidelines and sensitivity analysis to determine the response mechanism. Furthermore, we present the performance of the proposed estimators in numerical studies and apply the proposed method to two sets of real data: exit polls from the 19th South Korean election and public data collected from the Korean Survey of Household Finances and Living Conditions.
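The setting described above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the estimator proposed in the paper: it assumes a Gaussian outcome model with identity link (one member of the generalized linear model family) and a logistic response model in which the coefficient `a2` on the outcome controls whether nonresponse is nonignorable. All function names and coefficient values are hypothetical and chosen only for illustration. The sketch shows why a complete-case regression is roughly unbiased when `a2 = 0` and biased when `a2` is nonzero.

```python
import math
import random


def expit(t: float) -> float:
    """Logistic function 1 / (1 + exp(-t))."""
    return 1.0 / (1.0 + math.exp(-t))


def simulate(n: int, a2: float, seed: int = 0):
    """Draw (x, y, r) triples; r = 1 means the outcome y is observed.

    Assumed outcome model:  y | x ~ Normal(1.0 + 0.5 * x, 1)
    Assumed response model: P(r = 1 | x, y) = expit(0.5 + 0.3 * x + a2 * y)
    a2 != 0 makes the response depend on the possibly unobserved y,
    i.e. the nonresponse is nonignorable.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = 1.0 + 0.5 * x + rng.gauss(0.0, 1.0)
        p = expit(0.5 + 0.3 * x + a2 * y)
        r = 1 if rng.random() < p else 0
        rows.append((x, y, r))
    return rows


def fit_ols(pairs):
    """Least-squares (intercept, slope) for y regressed on x."""
    pairs = list(pairs)
    n = len(pairs)
    xbar = sum(x for x, _ in pairs) / n
    ybar = sum(y for _, y in pairs) / n
    sxx = sum((x - xbar) ** 2 for x, _ in pairs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in pairs)
    slope = sxy / sxx
    return ybar - slope * xbar, slope


def complete_case_fit(rows):
    """Fit the outcome model using respondents only (r == 1)."""
    return fit_ols((x, y) for x, y, r in rows if r == 1)


if __name__ == "__main__":
    for a2 in (0.0, 1.0):
        intercept, slope = complete_case_fit(simulate(20000, a2=a2, seed=1))
        print(f"a2={a2}: intercept={intercept:.3f}, slope={slope:.3f} "
              "(true values: 1.0 and 0.5)")
```

With `a2 = 0` the response depends only on the fully observed covariate, so the respondents-only fit should land close to the true coefficients. With `a2 = 1` respondents tend to have larger outcomes than nonrespondents with the same covariate value, so the respondents-only intercept drifts upward. Correcting this requires modeling the response mechanism jointly with the outcome, which is where the identifiability conditions discussed in the abstract become relevant.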
