Identifying Multiple Personalities in Large Language Models with External Evaluation

As Large Language Models (LLMs) are rapidly integrated into everyday applications, their behavior raises many societal and ethical concerns. One way to understand LLMs' behavior is to analyze their personalities. Many recent studies quantify LLMs' personalities using self-assessment tests designed for humans, yet critics question the applicability and reliability of these tests when applied to LLMs. In this paper, we investigate LLM personalities using an alternative measurement method, which we refer to as the external evaluation method: instead of prompting LLMs with multiple-choice questions on a Likert scale, we evaluate their personalities by analyzing their responses to open-ended situational questions with an external machine learning model. We first fine-tune a Llama2-7B model as an MBTI personality predictor that outperforms state-of-the-art models, and use it as the tool to analyze LLMs' responses. We then prompt the LLMs with situational questions and ask them to generate Twitter posts and comments, respectively, in order to assess their personalities when playing two different roles. Using the external evaluation method, we find that the personality types obtained for LLMs differ significantly between generating posts and generating comments, whereas humans show a consistent personality profile across these two situations. This shows that LLMs can exhibit different personalities depending on the scenario, highlighting a fundamental difference between personality in LLMs and in humans. With our work, we call for a re-evaluation of how personality is defined and measured in LLMs.
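The post-versus-comment comparison described above can be illustrated with a minimal sketch. It assumes an external predictor has already assigned one four-letter MBTI type to each generated text. The label lists, the function names, and the two statistics (total variation distance and per-axis agreement) are illustrative assumptions, not the procedure or data reported in the paper.

```python
from collections import Counter

# The four MBTI axes, in the order their letters appear in a type such as "ENFJ".
MBTI_AXES = ("EI", "SN", "TF", "JP")


def type_distribution(labels):
    """Return the relative frequency of each MBTI type in a list of labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {mbti_type: count / total for mbti_type, count in counts.items()}


def total_variation(p, q):
    """Total variation distance between two discrete distributions.

    0.0 means the distributions are identical; 1.0 means they share no types.
    """
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def axis_agreement(labels_a, labels_b):
    """For each MBTI axis, the fraction of paired items with the same letter."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label lists must be paired and of equal length")
    n = len(labels_a)
    return {
        axis: sum(a[i] == b[i] for a, b in zip(labels_a, labels_b)) / n
        for i, axis in enumerate(MBTI_AXES)
    }


# Hypothetical predictor outputs for the same four situational prompts,
# once when the model writes a post and once when it writes a comment.
post_types = ["ENFJ", "ENFJ", "ENFP", "ENFJ"]
comment_types = ["INTJ", "INTJ", "ISTJ", "INTJ"]

tv = total_variation(type_distribution(post_types),
                     type_distribution(comment_types))
agreement = axis_agreement(post_types, comment_types)

print(f"total variation distance: {tv:.2f}")
for axis, score in agreement.items():
    print(f"{axis} agreement: {score:.2f}")
```

In this toy input the two roles share no full type, so the distance is at its maximum, while the per-axis view shows that the disagreement is concentrated on the E/I and T/F axes. A consistent profile, as reported for humans, would correspond to a distance near zero and agreement near one on every axis.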
