This paper presents an innovative framework that integrates Large Language Models (LLMs) with an external Thinker module to enhance the reasoning capabilities of LLM-based agents. Rather than augmenting LLMs through prompt engineering, the Thinker directly harnesses knowledge from databases and employs various optimization techniques. The framework forms a reasoning hierarchy in which LLMs handle intuitive System-1 tasks such as natural language processing, while the Thinker handles cognitive System-2 tasks that require complex logical analysis and domain-specific knowledge. We demonstrate the framework on a 9-player Werewolf game, which demands dual-system reasoning. We introduce a communication protocol between the LLMs and the Thinker, and train the Thinker using data from 18,800 human sessions together with reinforcement learning. Experiments demonstrate the framework's effectiveness in deductive reasoning, speech generation, and online game evaluation. Additionally, we fine-tune a 6B LLM that surpasses GPT4 when integrated with the Thinker. This paper also contributes the largest dataset for social deduction games to date.