Automatic prompt engineering aims to enhance the generation quality of large language models (LLMs). Recent works utilize feedback generated from erroneous cases to guide prompt optimization. During inference, they may further retrieve several semantically related exemplars and concatenate them with the optimized prompts to improve performance. However, these works utilize only the feedback from the current step, ignoring historical and unselected feedback that is potentially beneficial. Moreover, exemplar selection considers only general semantic similarity and may be suboptimal in terms of task performance and alignment with the optimized prompt. In this work, we propose an Exemplar-Guided Reflection with Memory mechanism (ERM) to achieve more efficient and accurate prompt optimization. Specifically, we design an exemplar-guided reflection mechanism in which feedback generation is additionally guided by the generated exemplars. We further build two kinds of memory to fully utilize historical feedback information and support more effective exemplar retrieval. Empirical evaluations show that our method surpasses previous state-of-the-art methods with fewer optimization steps, e.g., improving the F1 score by 10.1 on the LIAR dataset and requiring half the optimization steps of ProTeGi.
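To make the abstract's ideas concrete, below is a minimal, illustrative Python sketch of a feedback-memory plus exemplar-memory optimization loop. All names here (FeedbackMemory, ExemplarMemory, optimize_prompt, call_llm, and the scoring heuristic) are hypothetical and are not the authors' implementation; they only illustrate the general idea of reusing historical feedback and retrieving task-relevant exemplars rather than relying solely on the current step's feedback.

from dataclasses import dataclass, field
import heapq

@dataclass(order=True)
class Feedback:
    neg_score: float                      # negated score so heapq acts as a max-heap
    text: str = field(compare=False)

class FeedbackMemory:
    """Keeps historical feedback so previously unselected feedback can be revisited."""
    def __init__(self):
        self._heap: list[Feedback] = []

    def add(self, text: str, score: float) -> None:
        heapq.heappush(self._heap, Feedback(-score, text))

    def top(self, k: int = 3) -> list[str]:
        return [fb.text for fb in heapq.nsmallest(k, self._heap)]

class ExemplarMemory:
    """Stores (input, label, score) exemplars; retrieval favors measured task benefit,
    not only semantic similarity (a simplification of the idea in the abstract)."""
    def __init__(self):
        self._items: list[tuple[float, str, str]] = []

    def add(self, x: str, y: str, score: float) -> None:
        self._items.append((score, x, y))

    def retrieve(self, k: int = 2) -> list[tuple[str, str]]:
        best = sorted(self._items, key=lambda t: t[0], reverse=True)[:k]
        return [(x, y) for _, x, y in best]

def call_llm(prompt: str) -> str:
    """Stub standing in for a real LLM call; replace with an actual API in practice."""
    return "improved: " + prompt

def optimize_prompt(prompt: str, errors: str, fb_mem: FeedbackMemory,
                    ex_mem: ExemplarMemory, steps: int = 3) -> str:
    for _ in range(steps):
        exemplars = ex_mem.retrieve()
        # Exemplar-guided reflection: feedback generation is conditioned on exemplars.
        feedback = call_llm(f"Errors: {errors}\nExemplars: {exemplars}\nGive feedback.")
        fb_mem.add(feedback, score=1.0)   # in practice, the score would come from evaluation
        # Rewrite the prompt using the best historical feedback, not only the newest.
        prompt = call_llm(f"Prompt: {prompt}\nFeedback: {fb_mem.top()}\nRewrite.")
    return prompt

In this sketch, the priority scores attached to feedback and exemplars stand in for whatever evaluation signal (e.g., validation accuracy) the optimization loop uses; the key point is that both memories persist across steps so that useful feedback and exemplars are not discarded after a single iteration.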