Recent advances in large language models (LLMs) demonstrate that their capabilities are comparable, or even superior, to those of humans in many natural language processing tasks. Despite this progress, LLMs remain inadequate at social-cognitive reasoning, at which humans are naturally adept. Drawing inspiration from psychological research on the links between certain personality traits and Theory-of-Mind (ToM) reasoning, and from prompt engineering research showing that LLMs' capabilities are highly sensitive to prompts, this study investigates how inducing personalities in LLMs through prompts affects their ToM reasoning capabilities. Our findings show that certain induced personalities can significantly affect the LLMs' reasoning capabilities in three different ToM tasks. In particular, traits from the Dark Triad have a larger and more variable effect on LLMs like GPT-3.5, Llama 2, and Mistral across the different ToM tasks. We find that LLMs that exhibit higher variance across personality prompts in ToM also tend to be more controllable in personality tests: personality traits in LLMs like GPT-3.5, Llama 2, and Mistral can be controllably adjusted through our personality prompts. In today's landscape, where role-play is a common strategy when using LLMs, our research highlights the need for caution, as models that adopt specific personas with personalities may also alter their reasoning abilities in unexpected ways.