Estimation and inference with modern longitudinal data from wearable devices, which consist of biological signals recorded at high-frequency time points, are burdened by massive computational costs. We propose a distributed estimation and inference procedure that efficiently estimates both functional and scalar parameters for intensively measured longitudinal outcomes. The procedure overcomes computational difficulties through a scalable divide-and-conquer algorithm that partitions the outcomes into smaller sets. We circumvent traditional basis selection problems by analyzing the data with quadratic inference functions within these smaller subsets, so that the basis functions have a low dimension. To address the challenge of combining estimates from dependent subsets, we propose a statistically efficient one-step estimator derived from a constrained generalized method of moments objective function with a smoothing penalty. We show theoretically and numerically that the proposed estimator is as statistically efficient as non-distributed alternatives and more computationally efficient. We demonstrate the practicality of our approach with an analysis of accelerometer data from the National Health and Nutrition Examination Survey.