The remarkable instruction-following capability of large language models (LLMs) has sparked growing interest in automatically learning suitable prompts. However, while many effective methods have been proposed, the cost incurred during the learning process (e.g., accessing the LLM and evaluating its responses) has not been taken into account. To overcome this limitation, this work explicitly incorporates a finite budget constraint into prompt learning. Towards developing principled solutions, a novel connection is established between prompt learning and fixed-budget best arm identification (BAI-FB) in multi-armed bandits (MAB). Based on this connection, a general framework, TRIPLE (besT aRm Identification for Prompt LEarning), is proposed to systematically harness the power of BAI-FB in prompt learning. Unique characteristics of prompt learning further motivate two embedding-based enhancements of TRIPLE that exploit the ideas of clustering and function approximation. Extensive experiments on multiple well-adopted tasks using both GPT-3.5 and Llama 2 demonstrate that TRIPLE significantly outperforms previous baselines while satisfying the limited budget constraints.
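The core connection above, treating each candidate prompt as an arm and each evaluated LLM response as a stochastic reward, can be sketched with sequential halving, a standard fixed-budget BAI algorithm. This is only an illustrative sketch, not the paper's TRIPLE implementation: the `pull` function below is a hypothetical stand-in for querying the LLM and scoring its response, and the toy reward model at the bottom is invented for demonstration.

```python
import math
import random

def sequential_halving(arms, pull, budget):
    """Fixed-budget best-arm identification via sequential halving.

    arms:   list of arm identifiers (here, candidate prompts).
    pull:   function arm -> stochastic reward in [0, 1]
            (stand-in for scoring one LLM response to the prompt).
    budget: total number of pulls (LLM evaluations) allowed.
    Returns the arm estimated to have the highest mean reward.
    """
    survivors = list(arms)
    rounds = max(1, math.ceil(math.log2(len(arms))))
    for _ in range(rounds):
        if len(survivors) == 1:
            break
        # Split the budget evenly across rounds and surviving arms.
        pulls = max(1, budget // (rounds * len(survivors)))
        means = {a: sum(pull(a) for _ in range(pulls)) / pulls
                 for a in survivors}
        # Keep the better-scoring half of the surviving prompts.
        survivors.sort(key=lambda a: means[a], reverse=True)
        survivors = survivors[: max(1, len(survivors) // 2)]
    return survivors[0]

# Toy illustration: arm i has true mean i / 10, observed via noisy
# Bernoulli feedback; 800 total pulls are spread over the halving rounds.
random.seed(0)
best = sequential_halving(
    arms=list(range(8)),
    pull=lambda a: float(random.random() < a / 10),
    budget=800,
)
```

Sequential halving spends the budget uniformly per round while aggressively discarding the weaker half of the prompts, so most evaluations concentrate on the strongest candidates, which is exactly the budget-aware behavior the abstract argues prior prompt-learning methods lack.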