Meta-Prompting: Enhancing Language Models with Task-Agnostic Scaffolding

We introduce meta-prompting, an effective scaffolding technique designed to enhance the functionality of language models (LMs). This approach transforms a single LM into a multi-faceted conductor, adept at managing and integrating multiple independent LM queries. By employing high-level instructions, meta-prompting guides the LM to break down complex tasks into smaller, more manageable subtasks. These subtasks are then handled by distinct "expert" instances of the same LM, each operating under specific, tailored instructions. Central to this process is the LM itself, in its role as the conductor, which ensures seamless communication and effective integration of the outputs from these expert models. The conductor additionally employs its inherent critical thinking and robust verification processes to refine and authenticate the end result. This collaborative prompting approach empowers a single LM to act simultaneously as a comprehensive orchestrator and a panel of diverse experts, significantly enhancing its performance across a wide array of tasks. The zero-shot, task-agnostic nature of meta-prompting greatly simplifies user interaction by obviating the need for detailed, task-specific instructions. Furthermore, our research demonstrates the seamless integration of external tools, such as a Python interpreter, into the meta-prompting framework, thereby broadening its applicability and utility. Through rigorous experimentation with GPT-4, we establish the superiority of meta-prompting over conventional scaffolding methods: when averaged across all tasks, including the Game of 24, Checkmate-in-One, and Python Programming Puzzles, meta-prompting augmented with Python interpreter functionality surpasses standard prompting by 17.1%, expert (dynamic) prompting by 17.3%, and multipersona prompting by 15.2%.
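The conductor-and-experts loop described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the authors' implementation: the `meta_prompt` function, the `CONDUCTOR_INSTRUCTIONS` text, the `Expert <Name>: """..."""` call format, the `FINAL ANSWER:` marker, and the `max_rounds` limit are all hypothetical choices made for this example, and `lm` stands in for any function that maps a prompt string to a completion string. The Python interpreter tool mentioned in the abstract is left out for brevity.

```python
import re
from typing import Callable, List

# Any function mapping a prompt to a completion (hypothetical stand-in for an LM API).
LM = Callable[[str], str]

# Assumed conductor instructions; the wording and message format are illustrative only.
CONDUCTOR_INSTRUCTIONS = (
    "You are the conductor. Break the task into subtasks. "
    'To consult an expert, write: Expert <Name>: """<instructions>""". '
    "When the result has been verified, write: FINAL ANSWER: <answer>."
)

EXPERT_CALL = re.compile(
    r'Expert (?P<name>[^:\n]+):\s*"""(?P<instructions>.*?)"""', re.DOTALL
)
FINAL = re.compile(r"FINAL ANSWER:\s*(?P<answer>.*)", re.DOTALL)


def meta_prompt(task: str, lm: LM, max_rounds: int = 5) -> str:
    """Run one LM as conductor, calling fresh instances of the same LM as experts."""
    history: List[str] = [f"{CONDUCTOR_INSTRUCTIONS}\n\nTask: {task}"]
    for _ in range(max_rounds):
        # The conductor sees the whole history, including earlier expert outputs.
        reply = lm("\n\n".join(history))
        history.append(reply)

        final = FINAL.search(reply)
        if final:
            return final.group("answer").strip()

        call = EXPERT_CALL.search(reply)
        if call is None:
            history.append("Consult an expert or give a FINAL ANSWER.")
            continue

        # The expert is the same LM queried independently: it sees only its
        # tailored instructions, not the conductor's history.
        name = call.group("name").strip()
        expert_output = lm(f"You are Expert {name}.\n{call.group('instructions')}")
        history.append(f"Expert {name} replied:\n{expert_output}")
    raise RuntimeError("no final answer within max_rounds")


def scripted_lm(prompt: str) -> str:
    """Toy stand-in for a real LM so the sketch runs offline."""
    if prompt.startswith("You are Expert Mathematician"):
        return "6 * 4 = 24"
    if "Expert Mathematician replied" in prompt:
        return "FINAL ANSWER: 6 * 4 = 24"
    return 'Expert Mathematician: """Combine 6 and 4 to make 24."""'


print(meta_prompt("Make 24 from 6 and 4", scripted_lm))
```

In this sketch the same `lm` callable plays both roles; the only difference between conductor and expert is the prompt each receives, which mirrors the single-LM framing of the abstract. Swapping `scripted_lm` for a call to a real model would be the natural next step.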
