We introduce Meta-Reasoning Prompting (MRP), a novel and efficient system prompting method for large language models (LLMs) inspired by human meta-reasoning. Traditional in-context learning-based reasoning techniques, such as Tree-of-Thoughts, show promise but lack consistent state-of-the-art performance across diverse tasks due to their specialized nature. MRP addresses this limitation by guiding LLMs to dynamically select and apply different reasoning methods based on the specific requirements of each task, optimizing both performance and computational efficiency. With MRP, LLM reasoning operates in two phases. First, the LLM identifies the most appropriate reasoning method using task input cues and objective descriptions of the available methods. It then applies the chosen method to complete the task. This dynamic strategy mirrors human meta-reasoning, allowing the model to excel across a wide range of problem domains. We evaluate the effectiveness of MRP through comprehensive benchmarks. The results demonstrate that MRP achieves or approaches state-of-the-art performance across diverse tasks. MRP represents a significant advancement in enabling LLMs to identify the cognitive demands of each problem and leverage the strengths of different reasoning approaches, enhancing their ability to handle diverse and complex problem domains efficiently. Every LLM deserves Meta-Reasoning Prompting to unlock its full potential and ensure adaptability in an ever-evolving landscape of challenges and applications.
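The two-phase procedure described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: `call_llm` is a hypothetical chat-completion function supplied by the caller, and the method pool and its objective descriptions are placeholder examples.

```python
# Hypothetical sketch of the two-phase MRP loop: (1) select a reasoning
# method from objective descriptions, (2) apply it to the task.
# `call_llm(prompt) -> str` is assumed to wrap any chat-completion API.

REASONING_METHODS = {
    "Chain-of-Thought": "Reason through the problem step by step before answering.",
    "Tree-of-Thoughts": "Explore several reasoning branches, evaluate them, and keep the best.",
    "Least-to-Most": "Decompose the problem into simpler subproblems and solve them in order.",
}

def select_method(task: str, call_llm) -> str:
    """Phase 1: ask the LLM to pick the best-suited method from the pool."""
    menu = "\n".join(f"- {name}: {desc}" for name, desc in REASONING_METHODS.items())
    prompt = (
        "Given the task below, reply with only the name of the single "
        f"best-suited reasoning method from this list:\n{menu}\n\n"
        f"Task: {task}\nMethod:"
    )
    choice = call_llm(prompt).strip()
    # Fall back to a general-purpose method if the reply is not in the pool.
    return choice if choice in REASONING_METHODS else "Chain-of-Thought"

def mrp_solve(task: str, call_llm) -> str:
    """Phase 2: apply the selected method's instruction to the task."""
    method = select_method(task, call_llm)
    prompt = f"{REASONING_METHODS[method]}\n\nTask: {task}"
    return call_llm(prompt)
```

The fallback in `select_method` is a design choice for the sketch: because selection and execution are both free-form generations, an out-of-pool reply must map to some default rather than raise an error.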