AI-Powered Early Diagnosis of Mental Health Disorders from Real-World Clinical Conversations

Mental health disorders remain among the leading cause of disability worldwide, yet conditions such as depression, anxiety, and Post-Traumatic Stress Disorder (PTSD) are frequently underdiagnosed or misdiagnosed due to subjective assessments, limited clinical resources, and stigma and low awareness. In primary care settings, studies show that providers misidentify depression or anxiety in over 60% of cases, highlighting the urgent need for scalable, accessible, and context-aware diagnostic tools that can support early detection and intervention. In this study, we evaluate the effectiveness of machine learning models for mental health screening using a unique dataset of 553 real-world, semistructured interviews, each paried with ground-truth diagnoses for major depressive episodes (MDE), anxiety disorders, and PTSD. We benchmark multiple model classes, including zero-shot prompting with GPT-4.1 Mini and MetaLLaMA, as well as fine-tuned RoBERTa models using LowRank Adaptation (LoRA). Our models achieve over 80% accuracy across diagnostic categories, with especially strongperformance on PTSD (up to 89% accuracy and 98% recall). We also find that using shorter context, focused context segments improves recall, suggesting that focused narrative cues enhance detection sensitivity. LoRA fine-tuning proves both efficient and effective, with lower-rank configurations (e.g., rank 8 and 16) maintaining competitive performance across evaluation metrics. Our results demonstrate that LLM-based models can offer substantial improvements over traditional self-report screening tools, providing a path toward low-barrier, AI-powerd early diagnosis. This work lays the groundwork for integrating machine learning into real-world clinical workflows, particularly in low-resource or high-stigma environments where access to timely mental health care is most limited.

翻译：心理健康障碍仍是全球范围内致残的主要原因之一，然而抑郁症、焦虑症和创伤后应激障碍（PTSD）等疾病常因主观评估、临床资源有限、社会污名化及认知度低而被漏诊或误诊。在初级诊疗环境中，研究表明医生对抑郁症或焦虑症的误判率超过60%，这凸显了对可扩展、易获取且具有情境感知能力的诊断工具的迫切需求，以支持早期发现和干预。本研究利用一个独特的真实世界半结构化访谈数据集（共553例），评估机器学习模型在心理健康筛查中的有效性，每例访谈均配有重度抑郁发作（MDE）、焦虑症和PTSD的真实诊断标签。我们对多类模型进行了基准测试，包括使用GPT-4.1 Mini和MetaLLaMA的零样本提示方法，以及采用低秩自适应（LoRA）微调的RoBERTa模型。我们的模型在所有诊断类别中均达到超过80%的准确率，其中对PTSD的识别表现尤为突出（准确率最高达89%，召回率达98%）。研究还发现，使用聚焦的短上下文片段能提升召回率，表明聚焦的叙事线索可增强检测敏感性。LoRA微调被证明兼具高效性与有效性，低秩配置（如秩为8和16）能在各项评估指标中保持竞争力。我们的结果表明，基于大语言模型的工具相较传统自评筛查工具可实现显著改进，为低门槛、人工智能驱动的早期诊断提供了可行路径。本研究为将机器学习整合至真实世界临床工作流程奠定了基础，尤其在资源匮乏或污名化严重、难以及时获得心理健康服务的环境中具有重要应用前景。

相关内容

MoDELS

关注 45

ACM/IEEE第23届模型驱动工程语言和系统国际会议，是模型驱动软件和系统工程的首要会议系列，由ACM-SIGSOFT和IEEE-TCSE支持组织。自1998年以来，模型涵盖了建模的各个方面，从语言和方法到工具和应用程序。模特的参加者来自不同的背景，包括研究人员、学者、工程师和工业专业人士。MODELS 2019是一个论坛，参与者可以围绕建模和模型驱动的软件和系统交流前沿研究成果和创新实践经验。今年的版本将为建模社区提供进一步推进建模基础的机会，并在网络物理系统、嵌入式系统、社会技术系统、云计算、大数据、机器学习、安全、开源等新兴领域提出建模的创新应用以及可持续性。官网链接：http://www.modelsconference.org/

O’Reilly报告：知识图谱崛起——面向现代数据集成和数据结构体系，“The Rise of the Knowledge Graph——Toward Modern Data Integration and the Data Fabric Architecture”

专知会员服务

49+阅读 · 2022年2月18日

UCM《机器学习导论笔记》，80页pdf CSE176 Introduction to Machine Learning

专知会员服务

32+阅读 · 2021年9月29日

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

35+阅读 · 2019年10月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日