Scalable regression calibration approaches to correcting measurement error in multi-level generalized functional linear regression models with heteroscedastic measurement errors

May 22, 2023

Roger S. Zoh, Yuanyuan Luan, Erjia Cui, Xiaoxin Yu, Heyang Ji, Sneha Jadhav, Carmen D. Tekwe

Wearable devices permit the continuous monitoring of biological processes, such as blood glucose metabolism, and of behaviors, such as sleep quality and physical activity. The monitoring often occurs in epochs of 60 seconds over multiple days, resulting in high-dimensional longitudinal curves that are best described and analyzed as functional data. From this perspective, the functional data are smooth, latent functions observed at discrete time intervals and are usually treated as prone to homoscedastic white noise. However, the assumption of homoscedastic errors might not be appropriate in this setting because the devices collect the data serially. While researchers have previously addressed measurement error in error-prone scalar covariates, less work has been done on correcting measurement error in high-dimensional longitudinal curves prone to heteroscedastic errors. We present two new methods for correcting measurement error in longitudinal functional curves prone to complex measurement error structures in multi-level generalized functional linear regression models. These methods are based on two-stage scalable regression calibration. We assume that the distributions of the scalar responses and of the surrogate measures prone to heteroscedastic errors both belong to the exponential family and that the measurement errors follow Gaussian processes. In simulations and sensitivity analyses, we established some finite-sample properties of these methods. In our simulations, both regression calibration methods performed better than estimators based on averaging the longitudinal functional data and estimators based on observations from a single day. We also applied the methods to assess the relationship between physical activity and type 2 diabetes in community-dwelling adults in the United States who participated in the National Health and Nutrition Examination Survey.
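To make the comparison in the abstract concrete, the following is a minimal sketch of classical regression calibration in a deliberately simplified setting. It is not the functional, multi-level method the paper proposes. It assumes a single scalar covariate, a linear outcome model, and homoscedastic additive Gaussian error across replicate days. All function names, parameter values, and the simulated data are hypothetical choices made for illustration only.

```python
import numpy as np

def ols_slope(x, y):
    """Slope from a simple least-squares regression of y on x."""
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))

def simulate(n=5000, days=4, beta1=1.0, sigma_u=1.0, seed=0):
    """Hypothetical data: true covariate x, error-prone daily surrogates w, outcome y."""
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, 1.0, size=n)                                 # latent true covariate
    w = x[:, None] + rng.normal(0.0, sigma_u, size=(n, days))        # one surrogate per day
    y = 0.5 + beta1 * x + rng.normal(0.0, 0.5, size=n)               # scalar response
    return w, y

def regression_calibration(w):
    """Stage 1: estimate E[X | mean of W] from the replicates (method of moments)."""
    days = w.shape[1]
    w_bar = w.mean(axis=1)
    sigma_u2 = w.var(axis=1, ddof=1).mean()                          # within-subject error variance
    sigma_x2 = max(w_bar.var(ddof=1) - sigma_u2 / days, 1e-12)       # variance of the true covariate
    lam = sigma_x2 / (sigma_x2 + sigma_u2 / days)                    # reliability of the day average
    return w_bar.mean() + lam * (w_bar - w_bar.mean())

w, y = simulate()
slope_single = ols_slope(w[:, 0], y)                  # naive: first day only
slope_avg = ols_slope(w.mean(axis=1), y)              # naive: average over days
slope_rc = ols_slope(regression_calibration(w), y)    # stage 2: regress y on calibrated covariate

print(f"single day: {slope_single:.3f}, average: {slope_avg:.3f}, calibrated: {slope_rc:.3f}")
```

Under these assumed settings (true slope 1, unit variances, four days), the expected attenuation factor is 1/(1+1) = 0.5 for the single-day estimator and 1/(1+1/4) = 0.8 for the day-average estimator, while the calibrated slope should land close to the true value. The paper's setting differs in that the covariate is a curve, the errors are heteroscedastic, and the outcome model is generalized, so this sketch only conveys the basic two-stage idea.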
