As of 2023, large language models (LLMs) excel at many tasks, but they still struggle with complex reasoning. Theory-of-mind (ToM) tasks, which require understanding agents' beliefs, goals, and mental states, are essential for common-sense reasoning involving humans, so improving LLM performance in this area is crucial. This study measures the ToM performance of GPT-4 and three GPT-3.5 variants (Davinci-2, Davinci-3, GPT-3.5-Turbo) and investigates whether in-context learning improves their ToM comprehension. We evaluated prompts featuring two-shot chain-of-thought reasoning and step-by-step thinking instructions. We found that LLMs trained with Reinforcement Learning from Human Feedback (RLHF), that is, all models except Davinci-2, improved their ToM accuracy via in-context learning. GPT-4 performed best in zero-shot settings, reaching nearly 80% ToM accuracy, but still fell short of the 87% human accuracy on the test set. However, when supplied with prompts for in-context learning, all RLHF-trained LLMs exceeded 80% ToM accuracy, with GPT-4 reaching 100%. These results demonstrate that appropriate prompting enhances LLM ToM reasoning, and they underscore the context-dependent nature of LLM cognitive capacities.
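The prompting conditions named in the abstract (zero-shot, step-by-step thinking instructions, and two-shot chain-of-thought) can be illustrated with a small prompt builder. This is a minimal sketch and not the study's actual materials: the function name `build_prompt`, the `Scenario:/Q:/A:` layout, the wording of the step-by-step instruction, and both worked examples are assumptions invented for illustration.

```python
from typing import Sequence, Tuple

# Assumed wording of the step-by-step instruction; the study's exact text may differ.
STEP_BY_STEP = "Let's think step by step."

# One worked example: (scenario, question, written-out reasoning ending in an answer).
Example = Tuple[str, str, str]


def build_prompt(
    scenario: str,
    question: str,
    examples: Sequence[Example] = (),
    step_by_step: bool = False,
) -> str:
    """Assemble a ToM prompt.

    No examples and step_by_step=False gives a zero-shot prompt.
    Two examples with written-out reasoning give a two-shot
    chain-of-thought prompt. step_by_step=True appends the instruction
    to the final answer cue.
    """
    parts = []
    for ex_scenario, ex_question, ex_reasoning in examples:
        parts.append(f"Scenario: {ex_scenario}\nQ: {ex_question}\nA: {ex_reasoning}")

    answer_cue = "A:"
    if step_by_step:
        answer_cue += " " + STEP_BY_STEP
    parts.append(f"Scenario: {scenario}\nQ: {question}\n{answer_cue}")
    return "\n\n".join(parts)


# Hypothetical false-belief examples, invented here and not taken from the test set.
DEMO_EXAMPLES = [
    (
        "Ana puts her keys in the drawer and leaves. Ben moves them to the shelf.",
        "Where will Ana look for her keys?",
        "Ana did not see Ben move the keys, so she still believes they are "
        "in the drawer. She will look in the drawer.",
    ),
    (
        "A box labeled 'tea' actually contains coffee. Li has never opened it.",
        "What does Li think is in the box?",
        "Li can only see the label, which says 'tea'. Li thinks it contains tea.",
    ),
]

if __name__ == "__main__":
    print(
        build_prompt(
            "Sam leaves a cake on the table. While Sam is out, the dog eats it.",
            "What does Sam expect to find on the table?",
            examples=DEMO_EXAMPLES,
            step_by_step=True,
        )
    )
```

Passing the returned string to each model and scoring the answers against the expected ones would give per-condition accuracies of the kind reported above; the model call itself is left out because it depends on the API in use.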