As an effective tool for eliciting the power of Large Language Models (LLMs), prompting has recently demonstrated unprecedented abilities across a variety of complex tasks. To further improve performance, prompt ensembling has attracted substantial interest as a way to tackle the hallucination and instability of LLMs. However, existing methods usually adopt a two-stage paradigm, which requires a pre-prepared set of prompts built with substantial manual effort and cannot perform directed optimization for different weak learners. In this paper, we propose a simple, universal, and automatic method named PREFER (Prompt Ensemble learning via Feedback-Reflect-Refine) to address these limitations. Specifically, since weak learners are supposed to focus on hard examples during boosting, PREFER builds a feedback mechanism that reflects on the inadequacies of existing weak learners. Based on this feedback, the LLM is asked to automatically synthesize new prompts for iterative refinement. Moreover, to make the evaluation of prompt effectiveness more stable, we propose a novel prompt bagging method involving forward and backward thinking, which is superior to majority voting and benefits both feedback and weight calculation in boosting. Extensive experiments demonstrate that PREFER achieves state-of-the-art performance on multiple types of tasks by a significant margin. We have made our code publicly available.
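To make the boosting loop described above concrete, the following is a minimal sketch, not the released PREFER code. It assumes binary labels in {-1, +1} and substitutes the classic AdaBoost weight update for whatever weighting the paper actually uses. The names `boost_prompts`, `ensemble_predict`, `toy_predict`, and `toy_refine` are hypothetical, and the two toy functions are deterministic stand-ins for real LLM calls. The forward and backward prompt bagging step is not covered.

```python
import math
from typing import Callable, List, Tuple

Example = Tuple[str, int]                       # (input text, label in {-1, +1})
Predictor = Callable[[str, str], int]           # (prompt, text) -> predicted label
Refiner = Callable[[str, List[Example]], str]   # (prompt, hard examples) -> new prompt


def boost_prompts(examples: List[Example], predict: Predictor, refine: Refiner,
                  initial_prompt: str, rounds: int = 3) -> List[Tuple[str, float]]:
    """AdaBoost-style loop in which each weak learner is a prompt (assumed setup)."""
    n = len(examples)
    weights = [1.0 / n] * n
    ensemble: List[Tuple[str, float]] = []
    prompt = initial_prompt
    for _ in range(rounds):
        # Feedback: find the examples the current prompt gets wrong.
        wrong = [predict(prompt, x) != y for x, y in examples]
        err = sum(w for w, bad in zip(weights, wrong) if bad)
        err = min(max(err, 1e-10), 1 - 1e-10)   # clamp to keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((prompt, alpha))
        # Up-weight hard examples and down-weight easy ones, then renormalize.
        weights = [w * math.exp(alpha if bad else -alpha)
                   for w, bad in zip(weights, wrong)]
        total = sum(weights)
        weights = [w / total for w in weights]
        hard = [ex for ex, bad in zip(examples, wrong) if bad]
        if not hard:
            break
        # Reflect and refine: synthesize a new prompt from the failures.
        prompt = refine(prompt, hard)
    return ensemble


def ensemble_predict(ensemble: List[Tuple[str, float]], predict: Predictor,
                     text: str) -> int:
    """Weighted vote over all prompts collected during boosting."""
    score = sum(alpha * predict(prompt, text) for prompt, alpha in ensemble)
    return 1 if score >= 0 else -1


# Toy stand-ins for LLM calls (hypothetical): the "prompt" is a list of cue words.
def toy_predict(prompt: str, text: str) -> int:
    return 1 if set(prompt.split()) & set(text.split()) else -1


def toy_refine(prompt: str, hard: List[Example]) -> str:
    # Add the first word of every missed positive example as a new cue.
    new_cues = [x.split()[0] for x, y in hard if y == 1]
    return " ".join([prompt] + new_cues)


EXAMPLES: List[Example] = [("great movie", 1), ("awful plot", -1),
                           ("superb acting", 1), ("boring scenes", -1)]

if __name__ == "__main__":
    learned = boost_prompts(EXAMPLES, toy_predict, toy_refine, "great")
    for text, label in EXAMPLES:
        print(text, label, ensemble_predict(learned, toy_predict, text))
```

In a real setting, `predict` and `refine` would each wrap an LLM call, with `refine` passing the misclassified examples back to the model as feedback. The sketch only shows how such feedback and the boosting weights could interact.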