Can Large Language Models (LLMs) simulate humans in making important decisions? Recent research has unveiled the potential of using LLMs to develop role-playing language agents (RPLAs), which mainly mimic the knowledge and tones of various characters. However, imitative decision-making necessitates a more nuanced understanding of personas. In this paper, we benchmark the ability of LLMs in persona-driven decision-making. Specifically, we investigate whether LLMs can predict characters' decisions given the preceding stories in high-quality novels. Leveraging character analyses written by literary experts, we construct a dataset, LIFECHOICE, comprising 1,462 characters' decision points from 388 books. We then conduct comprehensive experiments on LIFECHOICE with various LLMs and RPLA methodologies. The results demonstrate that state-of-the-art LLMs exhibit promising capabilities on this task, yet substantial room for improvement remains. Hence, we further propose CHARMAP, which adopts persona-based memory retrieval and significantly advances RPLAs on this task, achieving a 5.03% increase in accuracy.