Fine-tuning LLMs for Passive Depression Severity Estimation from AI Mental Health Dialogue

Depression is the leading cause of disability worldwide, and early detection of symptom change is essential for timely intervention. Validated instruments such as the Patient Health Questionnaire-9 (PHQ-9) support symptom monitoring at scale, but real-world completion rates are low, introducing response bias and systematic missingness. Passive approaches that infer severity from routinely generated data could close this gap. We address it by predicting PHQ-9 total scores directly from transcripts of conversations between users and an AI mental health application, requiring only conversation text and no additional clinical data. We fine-tune a Qwen3.5-27B backbone with a regression head and augment 3,111 ground-truth labels with pseudolabels generated by a reasoning model (Claude Opus) and by iteratively trained intermediate models, yielding a combined dataset of 6,283 users. On a held-out test set of 842 users, our best model achieves MAE = 2.6, RMSE = 4.0, and Pearson r = 0.80, with AUC = 0.91 at the PHQ-9 >= 10 clinical threshold. AUC also exceeds 0.87 at every severity threshold from PHQ-9 >= 3 to PHQ-9 >= 24, indicating that the model captures depression severity across the full clinical spectrum. This work opens the door to passive, continuous symptom monitoring in AI mental health platforms, without requiring users to complete self-report measures.
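As an illustration of the reported evaluation metrics, the sketch below computes MAE, RMSE, Pearson r, and a threshold-based AUC from true and predicted PHQ-9 totals. It is a minimal sketch, not the authors' evaluation code: the function names and the toy scores are hypothetical, and the AUC is computed here as the fraction of (positive, negative) user pairs that the predicted score ranks correctly, with ties counted as half.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error between true and predicted PHQ-9 totals."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error between true and predicted PHQ-9 totals."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def auc_at_threshold(y_true, y_pred, threshold=10):
    """AUC for detecting true PHQ-9 >= threshold, ranking users by predicted score.

    Pairwise (Mann-Whitney) formulation: the share of positive/negative
    pairs in which the positive user has the higher prediction.
    """
    pos = [p for t, p in zip(y_true, y_pred) if t >= threshold]
    neg = [p for t, p in zip(y_true, y_pred) if t < threshold]
    if not pos or not neg:
        raise ValueError("AUC needs at least one user on each side of the threshold")
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the paper.
    true_scores = [2, 5, 9, 12, 18, 24]
    predicted = [3.0, 4.5, 11.0, 10.0, 16.0, 21.0]
    print("MAE:", mae(true_scores, predicted))
    print("RMSE:", rmse(true_scores, predicted))
    print("Pearson r:", pearson_r(true_scores, predicted))
    print("AUC at PHQ-9 >= 10:", auc_at_threshold(true_scores, predicted, 10))
```

Calling `auc_at_threshold` once per cut-off (for example, each integer from 3 to 24) would give a per-threshold sweep of the kind the abstract reports, provided each cut-off leaves users on both sides.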

