Detecting Parkinson's Disease in its early stages from EEG data is a significant challenge. This paper introduces a novel approach that represents EEG data as a 15-variate series of bandpower and peak-frequency values. The hypothesis is that this representation captures the essential information in the noisy EEG signal and thereby improves disease detection. Statistical features extracted from this representation are used as input to interpretable machine learning models, specifically Decision Tree and AdaBoost classifiers. Our classification pipeline is deployed within our proposed framework, which identifies the data types and brain regions that are most important for classification. Interestingly, our analysis reveals that while no brain region is significantly more important than the others, the N1 sleep data type exhibits statistically significant predictive power (p < 0.01) for early-stage Parkinson's Disease classification. AdaBoost classifiers trained on the N1 data type consistently outperform baseline models, achieving over 80% accuracy and recall. The pipeline as a whole outperforms baseline models by a statistically significant margin, indicating that the models have learned useful information. Paired with the interpretability of the pipeline (the ability to inspect feature importances), this enables us to generate meaningful insights into the classification of early-stage Parkinson's with our N1 models. In future, these models could be deployed in the real world: the results presented in this paper indicate that more than 3 in 4 early-stage Parkinson's cases would be captured by our pipeline.
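The pipeline described in the abstract (a windowed bandpower and peak-frequency representation, statistical features per variate, and an AdaBoost classifier whose feature importances can be inspected) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the band edges in `BANDS`, the 2-second window, the four summary statistics, the helper names `band_series`, `summary_features` and `synthetic_recording`, and the synthetic sine-plus-noise signals that stand in for real EEG. The sketch produces 10 variates (5 bands, 2 values each) rather than the paper's 15, since the abstract does not state how the 15 are composed.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

# Hypothetical band edges in Hz; the paper's exact bands are not given in the abstract.
BANDS = {"delta": (0.5, 4.0), "theta": (4.0, 8.0), "alpha": (8.0, 13.0),
         "beta": (13.0, 30.0), "gamma": (30.0, 45.0)}


def band_series(signal, fs, win_sec=2.0):
    """Convert one channel into a multivariate series: for every
    non-overlapping window, the power and peak frequency of each band."""
    win = int(win_sec * fs)
    n_windows = len(signal) // win
    freqs = np.fft.rfftfreq(win, d=1.0 / fs)
    rows = []
    for w in range(n_windows):
        seg = signal[w * win:(w + 1) * win]
        seg = seg - seg.mean()  # remove the DC offset
        # Hann-windowed periodogram (unnormalised power spectrum)
        psd = np.abs(np.fft.rfft(seg * np.hanning(win))) ** 2
        row = []
        for lo, hi in BANDS.values():
            mask = (freqs >= lo) & (freqs < hi)
            row.append(psd[mask].sum())                  # band power
            row.append(freqs[mask][psd[mask].argmax()])  # peak frequency
        rows.append(row)
    return np.array(rows)  # shape: (n_windows, 2 * len(BANDS))


def summary_features(series):
    """Statistical features of each variate: mean, std, min and max over windows."""
    return np.concatenate([series.mean(axis=0), series.std(axis=0),
                           series.min(axis=0), series.max(axis=0)])


def synthetic_recording(rng, alpha_amp, fs=128, seconds=20):
    """Synthetic stand-in for EEG: a 10 Hz sine of given amplitude plus white noise."""
    t = np.arange(fs * seconds) / fs
    return alpha_amp * np.sin(2 * np.pi * 10.0 * t) + rng.normal(0.0, 1.0, t.size)


rng = np.random.default_rng(0)
fs = 128
X, y = [], []
# Two artificial classes that differ only in 10 Hz amplitude (not real patient data).
for label, amp in [(0, 0.5), (1, 2.0)]:
    for _ in range(20):
        series = band_series(synthetic_recording(rng, amp, fs), fs)
        X.append(summary_features(series))
        y.append(label)
X, y = np.array(X), np.array(y)

clf = AdaBoostClassifier(n_estimators=25, random_state=0).fit(X, y)
importances = clf.feature_importances_  # one non-negative weight per feature
```

Ranking `importances` shows which summary statistic of which band drives the decision, which is the kind of inspection the abstract refers to as interpretability. On real data the model would be evaluated on held-out recordings and compared against baselines, not scored on its training set as in this toy setup.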