Ecological Momentary Assessments (EMA) capture real-time thoughts and behaviors in natural settings, producing rich longitudinal data for statistical and physiological analyses. However, the robustness of these analyses can be compromised by the large amount of missing data in EMA data sets. To address this, multiple imputation, a method that replaces each missing value with several plausible alternatives, has become increasingly popular. In this paper, we introduce a two-step Bayesian multiple imputation framework that leverages the structure of mixed models. To complete the posterior distribution within the framework, we adopt three models: the Random Intercept Linear Mixed model; the Mixed-effect Location Scale model, which allows the within-subject variance to depend on covariates and random effects; and the Shared Parameter Location Scale Mixed Effect model, which links the missingness to the response variable through a random intercept logistic model. In a simulation study and an application to data from a study of caregivers of dementia patients, we further adapt this two-step strategy to handle several variables missing simultaneously in EMA data sets, and we compare the effectiveness of multiple imputation across the different mixed models. The analyses highlight the advantages of multiple imputation over single imputation. Furthermore, we propose two pivotal considerations in selecting the optimal mixed model for the two-step imputation: the influence of covariates and random effects on the within-subject variance, and the nature of the missing data in relation to the response variable.
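To make the imputation idea concrete, the following is a minimal sketch of multiple imputation under the simplest of the three models, the random intercept linear mixed model. It is not the implementation used in the paper. It assumes simulated data with values missing completely at random, weakly informative conjugate priors, and one common reading of "two-step": first draw the model parameters from their posterior given the observed data, then draw the missing values given those parameters. Pooling with Rubin's rules is likewise an assumption made for illustration, and all function names (`simulate_ema`, `impute_random_intercept`, `pool_rubin`) are hypothetical.

```python
import numpy as np

def simulate_ema(n_subjects=50, n_obs=20, miss_rate=0.3, seed=0):
    """Toy EMA-like data y_ij = 2 + b_i + e_ij, with values missing completely at random."""
    rng = np.random.default_rng(seed)
    subj = np.repeat(np.arange(n_subjects), n_obs)
    b = rng.normal(0.0, 1.0, n_subjects)                 # subject random intercepts
    y = 2.0 + b[subj] + rng.normal(0.0, 0.5, subj.size)
    y[rng.random(subj.size) < miss_rate] = np.nan        # MCAR missingness (assumption)
    return subj, y

def inv_gamma(rng, shape, rate):
    """Draw from an inverse-gamma(shape, rate) distribution."""
    return 1.0 / rng.gamma(shape, 1.0 / rate)

def impute_random_intercept(subj, y, m=5, burn_in=200, thin=50, seed=1):
    """Return m completed copies of y under a random intercept model (Gibbs sampler)."""
    rng = np.random.default_rng(seed)
    n_subj = subj.max() + 1
    obs = ~np.isnan(y)
    s_obs, y_obs = subj[obs], y[obs]
    n_i = np.bincount(s_obs, minlength=n_subj)           # observed prompts per subject
    mu, b, s2e, s2b = y_obs.mean(), np.zeros(n_subj), 1.0, 1.0
    completed = []
    for it in range(burn_in + m * thin):
        # Step 1: update parameters from their full conditionals (observed data only).
        prec = n_i / s2e + 1.0 / s2b
        resid_sum = np.bincount(s_obs, weights=y_obs - mu, minlength=n_subj)
        b = rng.normal(resid_sum / s2e / prec, np.sqrt(1.0 / prec))
        mu = rng.normal(np.mean(y_obs - b[s_obs]), np.sqrt(s2e / y_obs.size))
        resid = y_obs - mu - b[s_obs]
        s2e = inv_gamma(rng, 0.01 + y_obs.size / 2, 0.01 + 0.5 * resid @ resid)
        s2b = inv_gamma(rng, 0.01 + n_subj / 2, 0.01 + 0.5 * b @ b)
        # Step 2: on retained draws, impute missing values given the parameters.
        if it >= burn_in and (it - burn_in + 1) % thin == 0:
            y_imp = y.copy()
            y_imp[~obs] = rng.normal(mu + b[subj[~obs]], np.sqrt(s2e))
            completed.append(y_imp)
    return completed

def mean_of_subject_means(subj, y_full):
    """Cluster-level estimate of the overall mean and its sampling variance."""
    means = np.bincount(subj, weights=y_full) / np.bincount(subj)
    return means.mean(), means.var(ddof=1) / means.size

def pool_rubin(estimates, variances):
    """Rubin's rules: pooled estimate, within, between, and total variance."""
    q, u = np.asarray(estimates), np.asarray(variances)
    w, b = u.mean(), q.var(ddof=1)
    return q.mean(), w, b, w + (1.0 + 1.0 / q.size) * b

subj, y = simulate_ema()
completed = impute_random_intercept(subj, y)
q, u = zip(*(mean_of_subject_means(subj, yc) for yc in completed))
q_bar, w, b_var, t = pool_rubin(q, u)
print(f"pooled mean = {q_bar:.3f}, within = {w:.4f}, between = {b_var:.4f}, total = {t:.4f}")
```

The between-imputation variance is the component that a single imputation cannot estimate, which is one way to see why multiple imputation gives more honest uncertainty. Extending the sketch to the location scale or shared parameter models would require additional parameters for the within-subject variance and for the missingness model, which are not attempted here.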