With the growing prevalence of diabetes and its associated public health burden, it is crucial to identify modifiable factors that could improve patients' glycemic control. In this work, we examine associations between medication usage, concurrent comorbidities, and glycemic control, using data from continuous glucose monitors (CGMs). CGMs provide interstitial glucose measurements, but clinical studies commonly reduce these data to simple statistical summaries, resulting in substantial information loss. Recent advances in the Fréchet regression framework make it possible to use more of this information by treating the full distributional representation of CGM data as the response, while sparsity regularization enables variable selection. However, the existing methodology does not scale to large datasets. Crucially, inference on variable selection using subsampling methods is computationally infeasible. We develop a new algorithm for sparse distributional regression by deriving a new explicit characterization of the gradient and Hessian of the underlying objective function, and by using rotations on the sphere to perform feasible updates. The updated method is up to 10,000-fold faster than the original approach, opening the door to applying sparse distributional regression to large-scale datasets and enabling previously unattainable subsampling-based inference. Applying our method to CGM data from patients with type 2 diabetes and obstructive sleep apnea, we found a significant association between sulfonylurea medication and glucose variability, without evidence of an association with glucose mean. We also found that overnight oxygen desaturation variability showed a stronger association with glucose regulation than overall oxygen desaturation levels.
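To illustrate the information loss from summary statistics, the following minimal sketch compares two synthetic sets of glucose-like readings that share a mean but differ in spread. It represents each set by its empirical quantile function and measures the gap between them. The choice of the 2-Wasserstein distance, the probability grid, the function names, and the simulated data are all assumptions made for illustration; they are not taken from the method or data described above.

```python
import numpy as np

def quantile_representation(readings, grid_size=99):
    """Evaluate the empirical quantile function on an evenly spaced probability grid."""
    probs = np.linspace(0.01, 0.99, grid_size)  # hypothetical grid choice
    return probs, np.quantile(np.asarray(readings, dtype=float), probs)

def wasserstein2_distance(q_a, q_b):
    """Approximate the 2-Wasserstein distance between two one-dimensional
    distributions as the root-mean-square gap between their quantile
    functions on a shared, evenly spaced probability grid."""
    return float(np.sqrt(np.mean((q_a - q_b) ** 2)))

rng = np.random.default_rng(0)
# Synthetic glucose-like readings (mg/dL): same mean, different variability.
stable = rng.normal(150.0, 15.0, size=2000)
variable = rng.normal(150.0, 45.0, size=2000)

_, q_stable = quantile_representation(stable)
_, q_variable = quantile_representation(variable)

mean_gap = abs(stable.mean() - variable.mean())
w2 = wasserstein2_distance(q_stable, q_variable)
print(f"difference in means: {mean_gap:.2f}")
print(f"quantile-based distance: {w2:.2f}")
```

In this sketch the two means nearly coincide while the quantile-based distance is large, which is the kind of difference a distributional response can capture and a mean-only summary cannot.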