Language use has been shown to correlate with depression, but large-scale validation is still needed. Traditional methods such as clinical studies are expensive. Natural language processing has therefore been applied to social media data to predict depression, but limitations remain: a lack of validated labels, biased user samples, and missing context. Our study identified 29 topics in 3919 smartphone-collected speech recordings from 265 participants using the Whisper tool and the BERTopic model. Six topics with a median PHQ-8 score greater than or equal to 10 were regarded as risk topics for depression: No Expectations, Sleep, Mental Therapy, Haircut, Studying, and Coursework. To clarify how these topics emerge and how they relate to depression, we compared behavioral characteristics (from wearables) and linguistic characteristics across the identified topics. We also investigated the correlation between topic shifts and changes in depression severity over time, which indicated the importance of monitoring language use longitudinally. We further tested the BERTopic model on a similar, smaller dataset (356 speech recordings from 57 participants) and obtained partially consistent results. In summary, our findings demonstrate that specific speech topics may indicate depression severity. The presented data-driven workflow provides a practical approach to collecting and analyzing large-scale speech data from real-world settings for digital health research.
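The risk-topic criterion described in the abstract (a topic is flagged when the median PHQ-8 score of its recordings is at least 10) can be illustrated with a minimal sketch. This is not the authors' code. It assumes that each recording has already been transcribed and assigned a topic, and that it is paired with one PHQ-8 score. The function name `risk_topics`, the topic "Holiday", and all scores below are made up for illustration.

```python
from collections import defaultdict
from statistics import median

PHQ8_CUTOFF = 10  # threshold stated in the abstract


def risk_topics(records, cutoff=PHQ8_CUTOFF):
    """Return {topic: median PHQ-8} for topics whose median is >= cutoff.

    `records` is an iterable of (topic_label, phq8_score) pairs, one per
    recording. This pairing is an assumption of the sketch.
    """
    scores_by_topic = defaultdict(list)
    for topic, phq8 in records:
        scores_by_topic[topic].append(phq8)

    # Median PHQ-8 per topic, then keep only topics at or above the cutoff.
    medians = {t: median(s) for t, s in scores_by_topic.items()}
    return {t: m for t, m in medians.items() if m >= cutoff}


# Made-up (topic, PHQ-8) pairs purely for illustration; not study data.
toy_records = [
    ("Sleep", 12), ("Sleep", 9), ("Sleep", 14),
    ("Holiday", 3), ("Holiday", 6),
    ("Coursework", 10), ("Coursework", 11),
]

flagged = risk_topics(toy_records)
print(flagged)
```

With these toy values, "Sleep" and "Coursework" are flagged and "Holiday" is not. Using the median rather than the mean keeps a single extreme score from deciding whether a topic crosses the cutoff.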