In this paper, we introduce CPsyExam, a novel psychological benchmark constructed from questions sourced from Chinese language examinations. CPsyExam is designed to assess psychological knowledge and case analysis separately, recognizing the importance of applying psychological knowledge to real-world scenarios. From a pool of 22k questions, we select 4k to create a benchmark that offers balanced coverage of subjects and incorporates a diverse range of case analysis techniques. Furthermore, we evaluate a range of existing large language models (LLMs), spanning open-source and API-based models. Our experiments and analysis demonstrate that CPsyExam serves as an effective benchmark for enhancing the understanding of psychology within LLMs and enables the comparison of LLMs across various granularities.