Pathological speech analysis has attracted considerable research interest for detecting diseases such as depression and Alzheimer's disease. However, previous pathological speech analysis models are typically designed for a single disease and overlook the connections between diseases, which may constrain performance and lower training efficiency. Compared with fine-tuning a separate deep model for each task, prompt tuning is a much more efficient training paradigm. We therefore propose a unified pathological speech analysis system that covers three diseases using prompt tuning. The system builds on a pre-trained spoken language model and adjusts only a small fraction of the parameters to detect different diseases from the speech of potential patients, while demonstrating strong performance across multiple disorders. Because knowledge is shared across tasks, this efficient training approach leads to faster convergence and improved F1 scores. Our experiments on Alzheimer's disease, depression, and Parkinson's disease show competitive results, highlighting the effectiveness of our method for pathological speech analysis.
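The abstract does not describe the architecture, so the following is only a rough sketch of the general soft-prompt idea it relies on: one frozen backbone shared by all diseases, plus a small trainable prompt per disease that is prepended to the input sequence. The class name `PromptTunedSystem`, the task labels, and every size below are hypothetical and chosen for illustration; none of them come from the paper.

```python
import random
from dataclasses import dataclass, field


@dataclass
class PromptTunedSystem:
    """Toy model of prompt tuning: frozen shared backbone, one small prompt per task.

    All names and sizes are illustrative assumptions, not the paper's setup.
    """
    hidden_dim: int             # width of each feature vector
    prompt_length: int          # number of trainable prompt vectors per task
    backbone_param_count: int   # parameters of the frozen, shared backbone
    prompts: dict = field(default_factory=dict)  # task name -> list of prompt vectors

    def add_task(self, task, seed=0):
        """Create a small randomly initialised prompt for a new task."""
        rng = random.Random(seed)
        self.prompts[task] = [
            [rng.gauss(0.0, 0.02) for _ in range(self.hidden_dim)]
            for _ in range(self.prompt_length)
        ]

    def build_input(self, task, frames):
        """Prepend the task's prompt vectors to the speech feature frames."""
        return self.prompts[task] + frames

    def trainable_param_count(self):
        """Only the prompts are updated; the backbone stays frozen."""
        return len(self.prompts) * self.prompt_length * self.hidden_dim

    def trainable_fraction(self):
        """Share of all parameters that receive gradient updates."""
        trainable = self.trainable_param_count()
        return trainable / (self.backbone_param_count + trainable)


# Hypothetical sizes, small enough to inspect by hand.
system = PromptTunedSystem(hidden_dim=8, prompt_length=4, backbone_param_count=10_000)
for task_name in ("alzheimers", "depression", "parkinsons"):
    system.add_task(task_name)

frames = [[0.0] * 8 for _ in range(5)]          # five dummy feature frames
sequence = system.build_input("depression", frames)
print(len(sequence), system.trainable_param_count(), system.trainable_fraction())
```

In this sketch, adding a disease only adds `prompt_length * hidden_dim` trainable values, which is the sense in which a single system can serve several tasks while updating a small fraction of its parameters.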