Popular prompt strategies such as Chain-of-Thought Prompting can dramatically improve the reasoning abilities of Large Language Models (LLMs) in various domains. However, such hand-crafted prompt strategies are often sub-optimal. In this paper, we present Promptbreeder, a general-purpose self-referential self-improvement mechanism that evolves and adapts prompts for a given domain. Driven by an LLM, Promptbreeder mutates a population of task-prompts and then evaluates their fitness on a training set. Crucially, the mutation of these task-prompts is governed by mutation-prompts that the LLM generates and improves throughout evolution in a self-referential way. That is, Promptbreeder improves not only the task-prompts but also the mutation-prompts that improve them. Promptbreeder outperforms state-of-the-art prompt strategies such as Chain-of-Thought and Plan-and-Solve Prompting on commonly used arithmetic and commonsense reasoning benchmarks. Furthermore, Promptbreeder is able to evolve intricate task-prompts for the challenging problem of hate speech classification.
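The loop described above (mutate task-prompts, score them on a training set, and let the LLM also rewrite the mutation-prompts) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the `LLM` callable interface, the `Unit` class, the prompt templates, the pairwise tournament selection, the exact-match fitness, and the `hyper_rate` parameter are all hypothetical choices made here for the sake of a runnable example.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# Hypothetical interface: prompt text in, completion text out.
LLM = Callable[[str], str]


@dataclass
class Unit:
    """One member of the population: a task-prompt plus the mutation-prompt that evolves it."""
    task_prompt: str
    mutation_prompt: str
    fitness: float = 0.0


def evaluate(llm: LLM, task_prompt: str, train_set: List[Tuple[str, str]]) -> float:
    """Fitness = fraction of training questions answered exactly right (assumed metric)."""
    correct = sum(
        llm(f"{task_prompt}\nQ: {question}\nA:").strip() == answer
        for question, answer in train_set
    )
    return correct / len(train_set)


def evolve(
    llm: LLM,
    seeds: List[Tuple[str, str]],
    train_set: List[Tuple[str, str]],
    generations: int,
    hyper_rate: float = 0.2,  # assumed chance of rewriting the mutation-prompt itself
    rng: Optional[random.Random] = None,
) -> Unit:
    """Evolve (task-prompt, mutation-prompt) pairs; needs at least two seeds."""
    rng = rng or random.Random(0)
    population = [Unit(task, mutation) for task, mutation in seeds]
    for unit in population:
        unit.fitness = evaluate(llm, unit.task_prompt, train_set)

    for _ in range(generations):
        # Assumed selection scheme: pick two units; the fitter one overwrites the other.
        a, b = rng.sample(population, 2)
        winner, loser = (a, b) if a.fitness >= b.fitness else (b, a)

        mutation_prompt = winner.mutation_prompt
        if rng.random() < hyper_rate:
            # Self-referential step: the LLM rewrites the mutation-prompt.
            mutation_prompt = llm(
                f"Improve this instruction for rewriting prompts:\n{mutation_prompt}"
            )

        # The (possibly updated) mutation-prompt drives the task-prompt mutation.
        loser.task_prompt = llm(
            f"{mutation_prompt}\nINSTRUCTION: {winner.task_prompt}\nNEW INSTRUCTION:"
        )
        loser.mutation_prompt = mutation_prompt
        loser.fitness = evaluate(llm, loser.task_prompt, train_set)

    return max(population, key=lambda unit: unit.fitness)


def toy_llm(prompt: str) -> str:
    """Deterministic stand-in for a real model, used only to exercise the loop."""
    if prompt.startswith("Improve this instruction"):
        return "Rewrite the instruction to be more careful."
    if "NEW INSTRUCTION:" in prompt:
        return "Think step by step."
    # Answering mode: only a 'step by step' task-prompt yields the right answer.
    return "4" if "step by step" in prompt else "5"


seeds = [("Solve it.", "Rephrase the instruction."), ("Answer.", "Make the instruction clearer.")]
best = evolve(toy_llm, seeds, train_set=[("2+2", "4")], generations=3)
```

With a real model, `toy_llm` would be replaced by a function that calls the model and returns its completion. The training set and seed prompts would come from the target domain.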