Automatic speech recognition (ASR) technology can aid in the detection, monitoring, and assessment of depressive symptoms. ASR systems have been used to analyze speech patterns and characteristics that are indicative of depression, which affects not only a person's mood but also their speech. Individuals with depression may exhibit a slower speech rate, longer pauses, reduced pitch variability, and decreased overall fluency. Despite the growing use of machine learning in diagnosing depression, few studies address relapse. Furthermore, previous research on relapse prediction has focused primarily on clinical variables and has not taken into account other factors such as verbal and non-verbal cues. Another major challenge in depression relapse research is the scarcity of publicly available datasets. To overcome these issues, we propose a one-shot learning framework for detecting depression relapse from speech. We define depression relapse in terms of the similarity between the speech audio and textual encodings of a subject and those of a depressed individual. To detect relapse under this definition, we employ a Siamese neural network that models the similarity between two instances. Our proposed approach shows promising results and represents an advance in automatic depression relapse detection and the monitoring of mental disorders.
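The similarity-based definition above can be illustrated with a minimal sketch. It is not the architecture used in this work: the encoder is a single untrained layer with fixed random weights, the input vectors stand in for concatenated audio and text encodings, and the names `make_encoder`, `siamese_score`, and `flags_relapse`, as well as the threshold of 0.8, are hypothetical choices made only for illustration. What the sketch does show is the defining property of a Siamese setup: both inputs pass through the same encoder, and the decision rests on how close the two embeddings are.

```python
import math
import random


def make_encoder(in_dim, out_dim, seed=0):
    """Build a toy encoder: one tanh layer with fixed random weights.

    Assumption: a real system would learn these weights from paired examples.
    """
    rng = random.Random(seed)
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(in_dim)]
               for _ in range(out_dim)]

    def encode(features):
        # Project the feature vector and squash each output with tanh.
        return [math.tanh(sum(w * x for w, x in zip(row, features)))
                for row in weights]

    return encode


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors, in [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def siamese_score(encode, subject, reference):
    """Encode both inputs with the SAME encoder and compare the embeddings."""
    return cosine_similarity(encode(subject), encode(reference))


def flags_relapse(score, threshold=0.8):
    """Hypothetical decision rule: flag when similarity reaches the threshold."""
    return score >= threshold


if __name__ == "__main__":
    encode = make_encoder(in_dim=4, out_dim=3)
    reference = [0.2, 0.5, 0.1, 0.9]   # placeholder encoding of a depressed speaker
    subject = [0.25, 0.45, 0.15, 0.85]  # placeholder encoding of the monitored subject
    score = siamese_score(encode, subject, reference)
    print(score, flags_relapse(score))
```

Because only one reference example per class is needed at comparison time, this kind of scoring fits the one-shot setting described above, where labeled relapse data are scarce.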