In this paper, we introduce CPsyExam, a novel psychological benchmark constructed from questions sourced from Chinese-language examinations. CPsyExam treats psychological knowledge and case analysis as separate priorities, recognizing the significance of applying psychological knowledge to real-world scenarios. From a pool of 22k questions, we select 4k to create a benchmark that offers balanced coverage of subjects and incorporates a diverse range of case analysis techniques. Furthermore, we evaluate a range of existing large language models (LLMs), spanning from open-source to API-based models. Our experiments and analysis demonstrate that CPsyExam serves as an effective benchmark for enhancing the understanding of psychology within LLMs and enables the comparison of LLMs across various granularities.