Recent breakthroughs in large language models (LLMs) have brought remarkable success to the field of LLM-as-Agent. However, it is commonly assumed that the information LLMs process is always honest, which neglects the deceptive and misleading information pervasive in human society and AI-generated content. This oversight leaves LLMs susceptible to malicious manipulation, with potentially detrimental outcomes. This study uses the intricate Avalon game as a testbed to explore LLMs' potential in deceptive environments. Avalon, full of misinformation and requiring sophisticated logic, manifests as a "Game-of-Thoughts". Inspired by the efficacy of humans' recursive thinking and perspective-taking in the Avalon game, we introduce a novel framework, Recursive Contemplation (ReCon), to enhance LLMs' ability to identify and counteract deceptive information. ReCon combines two processes, formulation contemplation and refinement contemplation: formulation contemplation produces initial thoughts and speech, while refinement contemplation polishes them. We further incorporate first-order and second-order perspective transitions into these two processes, respectively. Specifically, the first-order transition allows an LLM agent to infer others' mental states, and the second-order transition involves understanding how others perceive the agent's mental state. After integrating ReCon with different LLMs, extensive experimental results from the Avalon game indicate its efficacy in helping LLMs discern and maneuver around deceptive information without additional fine-tuning or data. Finally, we offer a possible explanation for the efficacy of ReCon and explore the current limitations of LLMs in terms of safety, reasoning, speaking style, and format, potentially furnishing insights for subsequent research.
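The two-stage structure described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the prompt wording, and the `LLM` callable type are all assumptions made for illustration. The only parts taken from the abstract are the ordering (formulation, then refinement) and the pairing of first-order perspective transition with formulation and second-order perspective transition with refinement.

```python
from typing import Callable, Tuple

# Hypothetical interface: any function that maps a prompt string to a reply string.
LLM = Callable[[str], str]


def formulation_contemplation(llm: LLM, history: str, role: str) -> Tuple[str, str]:
    """Draft an initial thought and speech (prompts are illustrative assumptions)."""
    # First-order perspective transition: infer the other players' mental states.
    others = llm(
        f"You are {role}. Discussion so far:\n{history}\n"
        "Infer each other player's likely role and intention."
    )
    # Private thought, informed by the inferred mental states.
    thought = llm(
        f"You are {role}. Discussion so far:\n{history}\n"
        f"Your inference about the others:\n{others}\n"
        "Think privately about what to do next."
    )
    # Initial speech derived from the private thought.
    speech = llm(f"Private thought:\n{thought}\nWrite what you will say aloud.")
    return thought, speech


def refinement_contemplation(
    llm: LLM, role: str, thought: str, speech: str
) -> Tuple[str, str]:
    """Polish the draft thought and speech (prompts are illustrative assumptions)."""
    # Second-order perspective transition: how would others perceive the agent
    # if it delivered the drafted speech?
    perceived = llm(
        f"You are {role}. You plan to say:\n{speech}\n"
        "How would the other players interpret your role and intention?"
    )
    # Revise the private thought in light of that perception.
    new_thought = llm(
        f"Original thought:\n{thought}\n"
        f"How others would perceive you:\n{perceived}\n"
        "Revise the thought."
    )
    # Revise the speech so that it matches the revised thought.
    new_speech = llm(
        f"Revised thought:\n{new_thought}\nOriginal speech:\n{speech}\n"
        "Revise the speech."
    )
    return new_thought, new_speech


def recon_turn(llm: LLM, history: str, role: str) -> Tuple[str, str]:
    """Run formulation followed by refinement for a single turn."""
    thought, speech = formulation_contemplation(llm, history, role)
    return refinement_contemplation(llm, role, thought, speech)
```

Because the sketch only issues prompts to a caller-supplied function, it involves no fine-tuning or additional training data, which matches the setting the abstract describes. Any chat-completion client wrapped as a `str -> str` function could be passed in as `llm`.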