The detection of Alzheimer's disease (AD) from spontaneous speech has attracted increasing attention, but the scarcity of training data remains a major obstacle. This paper addresses the problem through knowledge transfer from two sources: speech-generic knowledge and depression-specific knowledge. It first studies sequential knowledge transfer from generic foundation models pretrained on large amounts of speech and text data. A block-wise analysis is performed for AD diagnosis, using representations extracted from different intermediate blocks of different foundation models. Beyond speech-generic representations, the paper also proposes to simultaneously transfer knowledge from a speech-based depression detection task, motivated by the high comorbidity of depression and AD. To this end, a parallel knowledge transfer framework is studied that jointly learns the information shared between the two tasks. Experimental results show that the proposed method improves both AD and depression detection, and achieves a state-of-the-art F1 score of 0.928 for AD diagnosis on the commonly used ADReSSo dataset.
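The block-wise analysis mentioned in the abstract can be illustrated with a minimal sketch: score the representation from each intermediate block with the same simple classifier, then compare F1 across blocks. Everything below is an assumption made for illustration. The abstract does not say which classifier is used, so a nearest-centroid classifier stands in for it, and the one-dimensional features are hand-made toy values, not outputs of any foundation model or results from the paper. The names `blockwise_f1` and `block_features` are hypothetical.

```python
from statistics import mean


def f1_score(y_true, y_pred, positive=1):
    """F1 for the positive class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def nearest_centroid_predict(train_x, train_y, test_x):
    """Stand-in classifier (an assumption): assign each test vector
    to the class whose mean training vector is closest."""
    centroids = {}
    for label in sorted(set(train_y)):
        rows = [x for x, y in zip(train_x, train_y) if y == label]
        centroids[label] = [mean(col) for col in zip(*rows)]

    def dist2(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    return [min(centroids, key=lambda c: dist2(x, centroids[c])) for x in test_x]


def blockwise_f1(block_features, train_y, test_y):
    """Map each block index to the F1 obtained from that block's features.

    block_features: dict of block index -> (train_x, test_x).
    """
    scores = {}
    for block, (train_x, test_x) in block_features.items():
        pred = nearest_centroid_predict(train_x, train_y, test_x)
        scores[block] = f1_score(test_y, pred)
    return scores


# Toy data (hypothetical): label 1 = AD, label 0 = control.
train_y = [1, 1, 0, 0]
test_y = [1, 0, 1, 0]
block_features = {
    # Block 0: features that do not separate the classes usefully.
    0: ([[0.0], [1.0], [0.2], [1.0]], [[0.9], [0.1], [0.8], [0.2]]),
    # Block 1: features that separate the classes cleanly.
    1: ([[1.0], [0.9], [0.1], [0.0]], [[0.8], [0.2], [0.7], [0.1]]),
}

scores = blockwise_f1(block_features, train_y, test_y)
best_block = max(scores, key=scores.get)
print(scores, best_block)
```

In practice the per-block features would be pooled hidden states from a pretrained speech or text model, and the comparison would be run with cross-validation rather than a single split.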