Modeling Parkinson's Disease Progression Using Longitudinal Voice Biomarkers: A Comparative Study of Statistical and Neural Mixed-Effects Models

Source: arXiv. Published version: Computer Methods and Programs in Biomedicine Update, DOI: 10.1016/j.cmpbup.2026.100242. Version note: https://doi.org/10.5281/zenodo.19804672

Longitudinal voice biomarkers provide a non-invasive source of information for monitoring Parkinson's disease progression, but their statistical analysis is difficult because repeated measurements from the same subject are correlated, clinical cohorts are often small, and disease trajectories can vary substantially across individuals. This study evaluates statistical and neural mixed-effects approaches for modeling Parkinson's disease progression from telemonitoring voice data. Using the Oxford Parkinson's telemonitoring dataset (N=42), we compare Neural Mixed Effects (NME) models, Generalized Neural Network Mixed Models (GNMMs), and semi-parametric Generalized Additive Mixed Models (GAMMs) under the same longitudinal prediction setting. The results show that neural mixed-effects models provide flexible nonlinear representations but can overfit severely in this small-sample setting, whereas GAMMs achieve stronger predictive performance and retain interpretable smooth effects and subject-level structure. In particular, the GAMM-based approach attains the lowest prediction error (MSE 6.56), while the neural baselines have substantially larger errors (MSE > 90). These findings support the use of interpretable statistical mixed-effects models for small longitudinal telemonitoring studies and suggest that larger and more diverse cohorts are needed before highly flexible neural mixed-effects models can be reliably assessed in this application.
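
For orientation, the general textbook form of a generalized additive mixed model with a subject-level random intercept can be written as below. This is a minimal sketch and not necessarily the specification fitted in the study: the symbols are illustrative assumptions, with y_{ij} denoting the progression score for subject i at visit j, x_{ijk} the k-th voice feature or time covariate, f_k an unknown smooth function, and b_i the subject-specific deviation that induces correlation among repeated measurements from the same subject.

```latex
% Generic GAMM with a random intercept (illustrative notation, assumed here)
\begin{align}
  y_{ij} &= \beta_0 + \sum_{k=1}^{p} f_k(x_{ijk}) + b_i + \varepsilon_{ij},
  \qquad i = 1, \dots, N, \quad j = 1, \dots, n_i, \\
  b_i &\sim \mathcal{N}(0, \sigma_b^2), \qquad
  \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2).
\end{align}
```

Under this form, the smooth terms f_k carry the interpretable covariate effects, while the variance component \sigma_b^2 summarizes between-subject heterogeneity. Neural mixed-effects variants broadly replace the additive smooth part with a neural network, which adds flexibility at the cost of many more parameters to estimate from the same N subjects.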
