Parkinson's disease (PD) is a neurological disorder that affects a person's speech. Among automatic PD assessment methods, deep learning models have attracted particular interest. Recently, the community has explored cross-pathology and cross-language models, which can further improve diagnostic accuracy. However, strict patient data privacy regulations largely prevent institutions from sharing patient speech data with each other. In this paper, we employ federated learning (FL) for PD detection using speech signals from three real-world language corpora in German, Spanish, and Czech, each from a separate institution. Our results indicate that the FL model outperforms all the local models in terms of diagnostic accuracy and performs comparably to the model trained on the centrally combined training sets, with the advantage of not requiring any data sharing among collaborators. This simplifies inter-institutional collaboration and can thereby improve patient outcomes.
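To make the FL setting concrete, the following is a minimal sketch of one aggregation round. It assumes a FedAvg-style weighted average of locally trained parameters; the abstract does not state which aggregation rule is used. The function name `federated_average`, the parameter layout, and the site sizes and values are hypothetical and chosen only for illustration. Only model parameters leave each site, never the speech recordings.

```python
from typing import Dict, List

# A model is represented here as a mapping from parameter name to a flat
# list of floats. This layout is an assumption made for illustration.
Weights = Dict[str, List[float]]


def federated_average(client_weights: List[Weights],
                      client_sizes: List[int]) -> Weights:
    """Average client parameters, weighting each client by its sample count."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client model")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of training samples must be positive")

    averaged: Weights = {}
    for name, reference in client_weights[0].items():
        averaged[name] = [
            sum(w[name][i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(len(reference))
        ]
    return averaged


# Hypothetical local models from three sites (values are made up).
german = {"w": [1.0], "b": [0.0]}
spanish = {"w": [2.0], "b": [2.0]}
czech = {"w": [4.0], "b": [2.0]}

global_model = federated_average([german, spanish, czech], [100, 50, 50])
print(global_model)  # {'w': [2.0], 'b': [1.0]}
```

In a full training loop, the server would send `global_model` back to each site, each site would continue training on its own corpus, and the aggregation step would repeat for a number of rounds.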