Survival of the Most Influential Prompts: Efficient Black-Box Prompt Search via Clustering and Pruning

Prompt-based learning has been an effective paradigm for large pretrained language models (LLM), enabling few-shot or even zero-shot learning. Black-box prompt search has received growing interest recently for its distinctive properties of gradient-free optimization, proven particularly useful and powerful for model-as-a-service usage. However, the discrete nature and the complexity of combinatorial optimization hinder the efficiency of modern black-box approaches. Despite extensive research on search algorithms, the crucial aspect of search space design and optimization has been largely overlooked. In this paper, we first conduct a sensitivity analysis by prompting LLM, revealing that only a small number of tokens exert a disproportionate amount of influence on LLM predictions. Leveraging this insight, we propose the Clustering and Pruning for Efficient Black-box Prompt Search (ClaPS), a simple black-box search method that first clusters and prunes the search space to focus exclusively on influential prompt tokens. By employing even simple search methods within the pruned search space, ClaPS achieves state-of-the-art performance across various tasks and LLMs, surpassing the performance of complex approaches while significantly reducing search costs. Our findings underscore the critical role of search space design and optimization in enhancing both the usefulness and the efficiency of black-box prompt-based learning.

翻译：提示学习已成为大型预训练语言模型的有效范式，能够实现少样本甚至零样本学习。近年来，黑盒提示搜索因其独特的无梯度优化特性而获得广泛关注，尤其在模型即服务场景中展现出实用性与强大能力。然而，离散性质与组合优化的复杂性阻碍了现代黑盒方法的效率。尽管搜索算法研究已取得大量进展，但搜索空间设计与优化这一关键环节却长期被忽视。本文首先通过对大语言模型进行提示敏感性分析，发现仅有少量token对模型预测产生显著影响。基于此洞察，我们提出基于聚类与剪枝的高效黑盒提示搜索方法，该方法通过聚类与剪枝搜索空间，聚焦于有影响力的提示token。即使在剪枝后的搜索空间中采用简单搜索方法，ClaPS方法在不同任务与大语言模型上均能实现最优性能，在显著降低搜索成本的同时超越复杂方法的表现。我们的研究结果凸显了搜索空间设计与优化在提升黑盒提示学习的实用性与效率中的关键作用。

相关内容

黑盒

关注 1

在科学，计算和工程学中，黑盒是一种设备，系统或对象，可以根据其输入和输出（或传输特性）对其进行查看，而无需对其内部工作有任何了解。它的实现是“不透明的”（黑色）。几乎任何事物都可以被称为黑盒：晶体管，引擎，算法，人脑，机构或政府。为了使用典型的“黑匣子方法”来分析建模为开放系统的事物，仅考虑刺激/响应的行为，以推断（未知）盒子。该黑匣子系统的通常表示形式是在该方框中居中的数据流程图。黑盒的对立面是一个内部组件或逻辑可用于检查的系统，通常将其称为白盒（有时也称为“透明盒”或“玻璃盒”）。

【CVPR 2022】一个完全无监督的框架，从噪声和部分测量中学习图像，Robust Equivariant Imaging: a fully unsupervised framework for learning to image

专知会员服务

25+阅读 · 2022年3月3日

UCM《机器学习导论笔记》，80页pdf CSE176 Introduction to Machine Learning

专知会员服务

32+阅读 · 2021年9月29日

【亚马逊-WWW2020】不解析,生成!用于面向任务的语义分析的序列到序列体系结构，Don't Parse, Generate! A Sequence to Sequence Architecture for Task-Oriented Semantic Parsing

专知会员服务

15+阅读 · 2020年2月1日

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

34+阅读 · 2019年10月18日