Large Language Models (LLMs) have demonstrated remarkable capabilities across NLP tasks. However, their performance in multilingual contexts, especially within the mental health domain, has not been thoroughly explored. In this paper, we evaluate proprietary and open-source LLMs on eight mental health datasets spanning various languages, as well as their machine-translated (MT) counterparts. We compare LLM performance in zero-shot, few-shot, and fine-tuned settings against conventional NLP baselines that do not employ LLMs. In addition, we assess translation quality across language families and typologies to understand its influence on LLM performance. Proprietary LLMs and fine-tuned open-source LLMs achieve competitive F1 scores on several datasets, often surpassing state-of-the-art results. However, performance on MT data is generally lower, and the extent of this decline varies by language and typology. This variation highlights both the strengths of LLMs in handling mental health tasks in languages other than English and their limitations when translation quality introduces structural or lexical mismatches.