Large Language Models (LLMs) are often misleadingly described as having a personality or a set of values. We argue that an LLM can instead be seen as a superposition of perspectives with different values and personality traits. LLMs exhibit context-dependent values and personality traits that change with the induced perspective (unlike humans, who tend to have more coherent values and personality traits across contexts). We introduce the concept of perspective controllability, which refers to a model's affordance to adopt various perspectives with differing values and personality traits. In our experiments, we use questionnaires from psychology (PVQ, VSM, IPIP) to study how exhibited values and personality traits change across perspectives. Through qualitative experiments, we show that LLMs express different values when those values are (implicitly or explicitly) implied in the prompt, and that they express different values even when none are obviously implied (demonstrating their context-dependent nature). We then conduct quantitative experiments to study the controllability of different models (GPT-4, GPT-3.5, OpenAssistant, StableVicuna, StableLM), the effectiveness of various methods for inducing perspectives, and the smoothness of the models' drivability. We conclude by examining the broader implications of our work and outlining a variety of associated scientific questions. Project website: https://sites.google.com/view/llm-superpositions