The remarkable instruction-following capability of large language models (LLMs) has sparked growing interest in automatically finding good prompts, i.e., prompt optimization. Most existing works follow the scheme of selecting from a pre-generated pool of candidate prompts. However, these designs mainly focus on the generation strategy, while limited attention has been paid to the selection method. In particular, the cost incurred during selection (e.g., accessing the LLM and evaluating its responses) is rarely explicitly considered. To overcome this limitation, this work provides a principled framework, TRIPLE, that efficiently performs prompt selection under an explicit budget constraint. TRIPLE is built on a novel connection established between prompt optimization and fixed-budget best arm identification (BAI-FB) in multi-armed bandits (MAB); it can thus systematically leverage the rich BAI-FB toolbox while also incorporating the unique characteristics of prompt optimization. Extensive experiments on multiple well-adopted tasks using various LLMs demonstrate substantial performance improvements of TRIPLE over baselines while satisfying the limited budget constraints. As an extension, variants of TRIPLE are proposed to efficiently select examples for few-shot prompts, also achieving superior empirical performance.
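To make the BAI-FB connection concrete, the sketch below applies sequential halving, a classical fixed-budget best-arm-identification algorithm, to prompt selection: each candidate prompt is an arm, and each evaluation (one LLM query plus scoring) consumes one unit of budget. This is only an illustrative sketch of the general idea, not TRIPLE's actual algorithm; the `evaluate` interface and all names here are hypothetical.

```python
import math

def sequential_halving(prompts, evaluate, budget):
    """Fixed-budget best-arm identification via sequential halving.

    prompts:  list of candidate prompts (the "arms").
    evaluate: callable(prompt) -> float score; each call is assumed to
              cost one unit of budget (e.g., one LLM query + scoring).
    budget:   total number of evaluations allowed.
    """
    arms = list(prompts)
    rounds = max(1, math.ceil(math.log2(len(arms))))
    scores = {p: [] for p in arms}
    for _ in range(rounds):
        if len(arms) == 1:
            break
        # Spread the remaining budget roughly evenly over rounds and arms.
        pulls = max(1, budget // (len(arms) * rounds))
        for p in arms:
            for _ in range(pulls):
                scores[p].append(evaluate(p))
        # Keep the better half of the arms by empirical mean score.
        arms.sort(key=lambda p: sum(scores[p]) / len(scores[p]), reverse=True)
        arms = arms[: max(1, len(arms) // 2)]
    return arms[0]
```

Because the candidate pool is halved each round, most of the budget is spent on the surviving (promising) prompts rather than wasted on uniformly evaluating every candidate, which is the key efficiency gain over naive uniform evaluation.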