Mental health has attracted substantial attention in recent years, and large language models (LLMs) can be an effective technology for alleviating mental health problems owing to their capabilities in text understanding and dialogue. However, existing research in this domain often suffers from limitations, such as training on datasets that lack crucial prior knowledge and evidence, and the absence of comprehensive evaluation methods. In this paper, we propose a specialized psychological LLM, named PsycoLLM, trained on a proposed high-quality psychological dataset that includes single-turn QA, multi-turn dialogues, and knowledge-based QA. Specifically, we construct the multi-turn dialogues through a three-step pipeline comprising multi-turn QA generation, evidence judgment, and dialogue refinement. We augment this process with real-world psychological case backgrounds extracted from online platforms, enhancing the relevance and applicability of the generated data. Additionally, to compare the performance of PsycoLLM with other LLMs, we develop a comprehensive psychological benchmark based on authoritative psychological counseling examinations in China, which includes assessments of professional ethics, theoretical proficiency, and case analysis. The experimental results on the benchmark illustrate the effectiveness of PsycoLLM, which outperforms the other LLMs compared.