Adapting state-of-the-art Large Language Models (LLMs) like GPT-4 and Gemini to specific tasks is challenging. Because their parameters, embeddings, and even output probabilities are inaccessible, existing fine-tuning methods are inapplicable. Consequently, adapting these black-box LLMs is only possible through their API services, raising concerns about transparency, privacy, and cost. To address these challenges, we introduce BBox-Adapter, a novel lightweight adapter for black-box LLMs. BBox-Adapter distinguishes target- and source-domain data by treating target data as positive and source data as negative. It employs a ranking-based Noise Contrastive Estimation (NCE) loss to promote the likelihood of target-domain data while penalizing that of the source domain. Furthermore, it features an online adaptation mechanism that incorporates real-time positive data sampled from ground-truth, human, or AI feedback, coupled with negative data from previous adaptations. Extensive experiments demonstrate BBox-Adapter's effectiveness and cost efficiency. It improves model performance by up to 6.77% across diverse tasks and domains, while reducing training and inference costs by 31.30x and 1.84x, respectively.
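The ranking-based NCE objective described above can be illustrated with a minimal sketch: given a scalar adapter score for one positive (target-domain) sample and scores for several negatives (source-domain samples), the loss raises the positive's likelihood relative to the negatives via a softmax over all candidates. This is an assumed scalar-score interface for illustration, not the paper's implementation; the function name is hypothetical.

```python
import math

def ranking_nce_loss(pos_score, neg_scores):
    """Ranking-style NCE loss over one positive and k negative scores.

    Minimizing this loss pushes pos_score up and the neg_scores down,
    since it is the negative log-probability of the positive under a
    softmax over all k+1 candidate scores.
    """
    logits = [pos_score] + list(neg_scores)
    # Log-sum-exp with max-subtraction for numerical stability.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return -(pos_score - log_z)
```

For example, with one negative scored equally to the positive, the loss is log 2 (the positive gets probability 1/2); raising the positive score relative to the negatives drives the loss toward zero.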