Parkinson's disease (PD) is a progressive neurodegenerative disorder that affects both motor and non-motor functions, including speech. Early and accurate recognition of PD through speech analysis can improve patient outcomes by enabling timely intervention. This paper reviews methods for PD recognition from speech data, highlighting advances in machine learning and data-driven approaches. We describe the data wrangling process (data collection, cleaning, transformation, and exploratory data analysis) used to prepare the dataset for machine learning. We then explore several classification algorithms, including logistic regression, support vector machines (SVM), and neural networks, each with and without feature selection, and evaluate every method on accuracy, precision, and training time. Our findings indicate that specific acoustic features and advanced machine learning techniques can effectively differentiate individuals with PD from healthy controls. The study concludes by comparing the models, identifying the most effective approaches for PD recognition, and suggesting directions for future research.