Can ChatGPT Assess Human Personalities? A General Evaluation Framework

Large Language Models (LLMs), especially ChatGPT, have produced impressive results in various areas, but their potential human-like psychology is still largely unexplored. Existing works study the virtual personalities of LLMs but rarely explore the possibility of analyzing human personalities via LLMs. This paper presents a generic evaluation framework for LLMs to assess human personalities based on Myers-Briggs Type Indicator (MBTI) tests. Specifically, we first devise unbiased prompts by randomly permuting the options in MBTI questions and adopting the average testing result to encourage more impartial answer generation. Then, we propose replacing the subject in question statements to enable flexible queries and assessments of different subjects by LLMs. Finally, we reformulate the question instructions as a correctness evaluation to help LLMs generate clearer responses. The proposed framework enables LLMs to flexibly assess the personalities of different groups of people. We further propose three evaluation metrics to measure the consistency, robustness, and fairness of assessment results from state-of-the-art LLMs including ChatGPT and GPT-4. Our experiments reveal ChatGPT's ability to assess human personalities, and the average results demonstrate that it achieves more consistent and fairer assessments than InstructGPT, despite lower robustness against prompt biases.
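The three prompt-design steps in the abstract (option permutation with averaging, subject replacement, and correctness-style instructions) can be illustrated with a minimal sketch. This is not the authors' implementation: the three-option scale, the scores, the prompt wording, and the names `build_prompt`, `assess`, and `always_first` are all hypothetical. The sketch also enumerates every ordering of a small option set instead of sampling random permutations, so that the result is reproducible. The `ask` callable stands in for a real LLM call.

```python
from itertools import permutations
from statistics import mean

# Hypothetical answer scale and scores; the abstract does not list the real options.
OPTIONS = {"agree": 1.0, "neutral": 0.0, "disagree": -1.0}


def build_prompt(statement, subject, option_order):
    """Fill in the subject and phrase the question as a correctness check."""
    text = statement.format(subject=subject)
    choices = ", ".join(option_order)
    return (
        f'Evaluate whether the following statement is correct: "{text}". '
        f"Answer with exactly one of: {choices}."
    )


def assess(statement, subject, ask):
    """Average the scored replies over every ordering of the options.

    `ask(prompt, option_order)` is a stand-in for an LLM call and must
    return one of the keys of OPTIONS.
    """
    scores = []
    for order in permutations(OPTIONS):
        reply = ask(build_prompt(statement, subject, order), order)
        scores.append(OPTIONS[reply])
    return mean(scores)


def always_first(prompt, option_order):
    """Toy responder with pure position bias: it picks the first option listed."""
    return option_order[0]


if __name__ == "__main__":
    statement = "{subject} enjoy meeting new people"  # hypothetical question
    print(build_prompt(statement, "Artists", ("agree", "neutral", "disagree")))
    print(assess(statement, "Artists", always_first))
```

With the position-biased toy responder, each option is listed first equally often across the orderings, so the averaged score cancels to zero. A responder that answered "agree" regardless of order would keep its score of 1.0, which is the behaviour that averaging over permutations is meant to preserve.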
