CELDA: Leveraging Black-box Language Model as Enhanced Classifier without Labels

Utilizing language models (LMs) without internal access is becoming an attractive paradigm in NLP, as many cutting-edge LMs are massive in scale and released only through APIs. The de facto method in this black-box scenario is prompting, which has shown steadily improving performance in situations where data labels are scarce or unavailable. Despite its efficacy, prompting still falls short of fully supervised counterparts and is generally brittle to slight modifications. In this paper, we propose Clustering-enhanced Linear Discriminative Analysis (CELDA), a novel approach that improves text classification accuracy using only a very weak supervision signal (i.e., the names of the labels). Our framework draws a precise decision boundary without accessing the weights or gradients of the LM, or any data labels. The core ideas of CELDA are twofold: (1) extracting a refined pseudo-labeled dataset from an unlabeled dataset, and (2) training a lightweight and robust model on top of the LM, which learns an accurate decision boundary from the extracted noisy dataset. Through in-depth investigations on various datasets, we demonstrate that CELDA reaches a new state of the art in weakly-supervised text classification and narrows the gap with a fully-supervised model. Additionally, our proposed methodology can be applied universally to any LM and has the potential to scale to larger models, making it a more viable option for utilizing large LMs.
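The two core ideas can be illustrated with a toy sketch. This is not the authors' implementation, only one plausible reading of the abstract under loud assumptions: synthetic 2-D points stand in for features returned by a black-box LM, hand-built noisy labels stand in for zero-shot prompting predictions, a cluster-purity filter (with a hypothetical `min_purity` threshold) stands in for the refinement step, and a shared-covariance LDA is the lightweight model. All function names here are illustrative.

```python
import numpy as np

def kmeans(X, k, n_iter=10):
    """Lloyd's algorithm with farthest-point init (deterministic for the toy run)."""
    idx = [0]
    for _ in range(k - 1):
        d = ((X[:, None, :] - X[idx][None, :, :]) ** 2).sum(axis=2).min(axis=1)
        idx.append(int(d.argmax()))
    centers = X[idx].astype(float)
    for _ in range(n_iter):
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = dists.argmin(axis=1)
        for j in range(k):
            if np.any(assign == j):
                centers[j] = X[assign == j].mean(axis=0)
    return assign

def filter_by_cluster_purity(assign, pseudo, n_classes, min_purity=0.8):
    """Keep samples whose cluster mostly agrees on a single pseudo-label."""
    keep = np.zeros(len(assign), dtype=bool)
    for j in np.unique(assign):
        members = assign == j
        counts = np.bincount(pseudo[members], minlength=n_classes)
        if counts.max() / counts.sum() >= min_purity:
            keep |= members
    return keep

def fit_lda(X, y, n_classes, reg=1e-3):
    """Shared-covariance LDA; assumes every class has at least one sample."""
    means = np.stack([X[y == c].mean(axis=0) for c in range(n_classes)])
    priors = np.array([(y == c).mean() for c in range(n_classes)])
    centered = X - means[y]
    cov = centered.T @ centered / len(X) + reg * np.eye(X.shape[1])
    W = np.linalg.solve(cov, means.T)                      # shape (d, C)
    b = -0.5 * (means * W.T).sum(axis=1) + np.log(priors)  # shape (C,)
    return W, b

def predict_lda(X, W, b):
    return (X @ W + b).argmax(axis=1)

# --- toy data (assumption: stands in for LM features of unlabeled texts) ---
rng = np.random.default_rng(0)
n = 100
blob_centers = np.repeat(np.array([[-6.0, 0.0], [0.0, 0.0], [6.0, 0.0]]), n, axis=0)
X = blob_centers + rng.uniform(-1.0, 1.0, size=(3 * n, 2))
y_true = (X[:, 0] > 0).astype(int)  # used for evaluation only, never for training

# Noisy pseudo-labels (assumption: stands in for zero-shot prompt predictions).
# Outer blobs: every 10th label is flipped. Middle blob: labels are pure guesses.
i = np.arange(3 * n)
middle = (i >= n) & (i < 2 * n)
pseudo = np.where(middle, i % 2, np.where(i % 10 == 0, 1 - y_true, y_true))

# (1) refine the pseudo-labeled set, (2) train the lightweight classifier on it.
assign = kmeans(X, k=3)
keep = filter_by_cluster_purity(assign, pseudo, n_classes=2)
W, b = fit_lda(X[keep], pseudo[keep], n_classes=2)
pred = predict_lda(X, W, b)

acc_pseudo = (pseudo == y_true).mean()
acc_lda = (pred == y_true).mean()
print(f"kept {keep.sum()} of {len(X)} samples")
print(f"pseudo-label accuracy: {acc_pseudo:.3f}, LDA accuracy: {acc_lda:.3f}")
```

In this toy setup the ambiguous middle cluster is discarded because its pseudo-labels disagree, and the LDA fitted on the remaining samples is then applied to all points, including the discarded ones. Nothing in the pipeline touches model weights, gradients, or true labels, which is the property the abstract emphasizes.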

