Detecting the Severity of Major Depressive Disorder from Speech: A Novel HARD-Training Methodology

Edward L. Campbell, Judith Dineley, Pauline Conde, Faith Matcham, Femke Lamers, Sara Siddi, Laura Docio-Fernandez, Carmen Garcia-Mateo, Nicholas Cummins, the RADAR-CNS Consortium

From arXiv, Error in Training Code

Major Depressive Disorder (MDD) is a common mental health condition worldwide, with high associated socioeconomic costs. The prediction and automatic detection of MDD could therefore have a substantial impact on society. Speech, as a non-invasive, easy-to-collect signal, is a promising marker to aid the diagnosis and assessment of MDD. In this regard, speech samples were collected as part of the Remote Assessment of Disease and Relapse in Major Depressive Disorder (RADAR-MDD) research programme. RADAR-MDD was an observational cohort study in which speech and other digital biomarkers were collected from a cohort of individuals with a history of MDD in Spain, the United Kingdom and the Netherlands. In this paper, the RADAR-MDD speech corpus was taken as an experimental framework to test the efficacy of a Sequence-to-Sequence model with a local attention mechanism in a two-class depression severity classification paradigm. Additionally, a novel training method, HARD-Training, is proposed. It is a methodology inspired by the curriculum learning paradigm and based on selecting the more ambiguous samples for model training. HARD-Training consistently improved the performance of our classifiers, with an average increment of 8.6%, for both speech elicitation tasks and for each collection site of the RADAR-MDD speech corpus. With this novel methodology, our Sequence-to-Sequence model was able to effectively detect MDD severity regardless of language. Finally, recognising the need for greater awareness of potential algorithmic bias, we conduct an additional analysis of our results separately for each gender.
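
The abstract describes HARD-Training only as selecting the more ambiguous samples for training. As a rough illustration of that idea, and not the authors' implementation, the sketch below assumes that ambiguity is measured by how close a two-class classifier's predicted probability is to 0.5. The function names `ambiguity` and `select_ambiguous`, the scoring rule, and the example probabilities are all hypothetical.

```python
from typing import List, Sequence


def ambiguity(p: float) -> float:
    """Score in [0, 1]: 1.0 when p is exactly 0.5, 0.0 when p is 0 or 1.

    Assumption: p is the predicted probability of the positive class
    from some two-class model. This scoring rule is illustrative only.
    """
    return 1.0 - 2.0 * abs(p - 0.5)


def select_ambiguous(probs: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k samples with the most ambiguous predictions."""
    if k < 0:
        raise ValueError("k must be non-negative")
    # Sort sample indices from most to least ambiguous; ties keep input order.
    ranked = sorted(range(len(probs)), key=lambda i: ambiguity(probs[i]), reverse=True)
    return ranked[:k]


# Hypothetical predicted probabilities for five training samples.
example_probs = [0.95, 0.52, 0.10, 0.45, 0.70]
hard_indices = select_ambiguous(example_probs, k=2)
print(hard_indices)
```

In this toy example the samples with probabilities 0.52 and 0.45 are selected, since they lie closest to the decision boundary. How the selected subset is then used during training is not specified in the abstract, so it is left out here.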

