Memory disorders are a central factor in the decline of functioning and daily activities in elderly individuals. Confirming the illness, starting medication to slow its progression, and beginning occupational therapy aimed at maintaining and rehabilitating cognitive abilities all require a medical diagnosis. Early identification of the symptoms of memory disorders, especially the decline in cognitive abilities, therefore plays a significant role in ensuring the well-being of populations. Features related to speech production are known to correlate with the speaker's cognitive ability and its changes. The lack of standardized speech tests in clinical settings has led to a growing emphasis on developing automatic machine learning techniques for analyzing naturally spoken language. Acoustic, non-lexical properties of spoken language have proven useful when fast, cost-effective, and scalable solutions are needed for the rapid diagnosis of a disease. This work presents a feature selection approach that automatically selects the features essential for diagnosis from relative speech pauses and from the Geneva Minimalistic Acoustic Parameter Set, which is intended for automatic paralinguistic and clinical speech analysis. These features are refined into word histogram features, on which machine learning classifiers are trained to distinguish control subjects from dementia patients in the DementiaBank Pitt audio database. The results show that an average classification accuracy of 75% is achievable with only twenty-five features, both on the separate ADReSS 2020 competition test data and in Leave-One-Subject-Out cross-validation over the entire competition data. These results rank among the best reported in international research in which the same dataset and only acoustic features have been used to diagnose patients.
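The evaluation protocol described above can be illustrated with a minimal sketch. It assumes that "word histogram features" means a bag-of-words style count of the nearest codeword for each acoustic frame, and it uses a nearest-centroid classifier as a stand-in. Neither the codebook, the classifier, nor the toy data comes from the work itself, the function names are hypothetical, and the feature selection step is not shown.

```python
from collections import defaultdict
from math import dist


def histogram_features(frames, codebook):
    """Map frame-level vectors to a normalized histogram of nearest codewords."""
    counts = [0] * len(codebook)
    for frame in frames:
        nearest = min(range(len(codebook)), key=lambda k: dist(frame, codebook[k]))
        counts[nearest] += 1
    total = sum(counts) or 1  # avoid division by zero for empty recordings
    return [c / total for c in counts]


def nearest_centroid_fit(X, y):
    """Return one mean feature vector per class label."""
    grouped = defaultdict(list)
    for x, label in zip(X, y):
        grouped[label].append(x)
    return {label: [sum(col) / len(rows) for col in zip(*rows)]
            for label, rows in grouped.items()}


def nearest_centroid_predict(centroids, x):
    """Pick the class whose centroid is closest to x."""
    return min(centroids, key=lambda label: dist(x, centroids[label]))


def leave_one_subject_out(X, y, subjects):
    """Hold out every recording of one subject per fold; return overall accuracy."""
    correct = 0
    for held_out in sorted(set(subjects)):
        train = [i for i, s in enumerate(subjects) if s != held_out]
        test = [i for i, s in enumerate(subjects) if s == held_out]
        centroids = nearest_centroid_fit([X[i] for i in train], [y[i] for i in train])
        correct += sum(nearest_centroid_predict(centroids, X[i]) == y[i] for i in test)
    return correct / len(X)


# Toy data invented for illustration: 2-D "acoustic" frames per recording.
codebook = [(0.0, 0.0), (1.0, 1.0)]  # assumed codewords, not learned here
recordings = {
    "s1": ([(0.1, 0.0), (0.0, 0.2), (0.9, 1.0)], "control"),
    "s2": ([(0.0, 0.1), (0.2, 0.1), (0.1, 0.1)], "control"),
    "s3": ([(1.0, 0.9), (0.8, 1.1), (0.1, 0.0)], "dementia"),
    "s4": ([(1.1, 1.0), (0.9, 0.9), (1.0, 1.2)], "dementia"),
}
subject_ids = list(recordings)
X = [histogram_features(recordings[s][0], codebook) for s in subject_ids]
y = [recordings[s][1] for s in subject_ids]
accuracy = leave_one_subject_out(X, y, subject_ids)
print(f"LOSO accuracy: {accuracy:.2f}")
```

Holding out all recordings of a subject in each fold keeps speaker identity from leaking between training and test data, which matters whenever a subject contributes more than one recording.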