Black-box Prompt Learning for Pre-trained Language Models

The increasing scale of general-purpose Pre-trained Language Models (PLMs) necessitates the study of more efficient adaptation across different downstream tasks. In this paper, we propose Black-box Discrete Prompt Learning (BDPL), which is designed to match the practical interaction between cloud infrastructure and edge devices. Specifically, instead of fine-tuning the model in the cloud, we adapt PLMs by prompt learning, which efficiently optimizes only a few parameters of the discrete prompts. Moreover, we consider the scenario in which we have no access to the parameters or gradients of the pre-trained models, only to their outputs for given inputs. This black-box setting protects the cloud infrastructure from attacks and misuse that could cause a single-point failure, and current infrastructures therefore prefer it to the white-box counterpart. Under this black-box constraint, we apply a variance-reduced policy gradient algorithm to estimate the gradients of the parameters in the categorical distribution of each discrete prompt. With our method, user devices can efficiently tune their tasks by querying the PLMs within a bounded number of API calls. Our experiments on RoBERTa and GPT-3 demonstrate that the proposed algorithm achieves significant improvement on eight benchmarks in a cloud-device collaboration manner. Finally, we conduct in-depth case studies to comprehensively analyze our method in terms of data size, prompt length, training budget, optimization objective, prompt transferability, and explanations of the learned prompts. Our code will be available at https://github.com/shizhediao/Black-Box-Prompt-Learning.
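
To make the gradient estimation step concrete, the following is a minimal sketch of a score-function (policy gradient) estimator with a mean-loss baseline for variance reduction, applied to per-position categorical distributions over prompt tokens. It is not the released implementation. The softmax-logit parameterization, the function names, the hyperparameter values, and toy_loss (a hypothetical stand-in for a loss computed from PLM API outputs) are all assumptions made for illustration.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_prompt(logits, rng):
    """Draw one token index per prompt position from its categorical distribution."""
    return [rng.choices(range(len(row)), weights=softmax(row))[0] for row in logits]


def estimate_gradient(logits, black_box_loss, num_samples, rng):
    """Policy gradient estimate that uses only loss values, never model gradients.

    Subtracting the mean loss of the sampled prompts acts as a baseline,
    which is the variance-reduction step in this sketch.
    """
    samples = [sample_prompt(logits, rng) for _ in range(num_samples)]
    losses = [black_box_loss(s) for s in samples]  # one query per sampled prompt
    baseline = sum(losses) / num_samples
    probs = [softmax(row) for row in logits]
    grad = [[0.0] * len(row) for row in logits]
    for sample, loss in zip(samples, losses):
        advantage = (loss - baseline) / (num_samples - 1)
        for i, token in enumerate(sample):
            for j in range(len(grad[i])):
                # d log softmax(row)[token] / d row[j] = 1[j == token] - p_j
                indicator = 1.0 if j == token else 0.0
                grad[i][j] += advantage * (indicator - probs[i][j])
    return grad


def train(black_box_loss, prompt_length, vocab_size,
          steps=200, num_samples=8, lr=1.0, seed=0):
    """Gradient descent on the logits; total queries = steps * num_samples."""
    rng = random.Random(seed)
    logits = [[0.0] * vocab_size for _ in range(prompt_length)]
    for _ in range(steps):
        grad = estimate_gradient(logits, black_box_loss, num_samples, rng)
        for i in range(prompt_length):
            for j in range(vocab_size):
                logits[i][j] -= lr * grad[i][j]
    return logits


def toy_loss(prompt, target=(2, 0, 1)):
    """Hypothetical stand-in for a PLM API call: counts mismatched positions."""
    return float(sum(t != g for t, g in zip(prompt, target)))


if __name__ == "__main__":
    learned = train(toy_loss, prompt_length=3, vocab_size=4)
    # Most likely token at each prompt position after training.
    print([max(range(4), key=row.__getitem__) for row in learned])
```

In this sketch the number of black-box queries is fixed in advance at steps * num_samples, which mirrors the idea of tuning within a bounded number of API calls. Parameterizing each distribution by logits keeps the probabilities valid without a projection step; a direct parameterization by probabilities would need one.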
