Mask-guided BERT for Few Shot Text Classification

Wenxiong Liao, Zhengliang Liu, Haixing Dai, Zihao Wu, Yiyang Zhang, Xiaoke Huang, Yuzhong Chen, Xi Jiang, Wei Liu, Dajiang Zhu, Tianming Liu, Sheng Li, Xiang Li, Hongmin Cai

Transformer-based language models have achieved significant success in various domains. However, the data-intensive nature of the transformer architecture demands large amounts of labeled data, which is hard to obtain in low-resource scenarios such as few-shot learning (FSL). The main challenge of FSL is training robust models on a small number of samples, which frequently leads to overfitting. Here we present Mask-BERT, a simple and modular framework to help BERT-based architectures tackle FSL. The proposed approach differs fundamentally from existing FSL strategies such as prompt tuning and meta-learning. The core idea is to selectively apply masks to text inputs and filter out irrelevant information, which guides the model to focus on the discriminative tokens that influence prediction results. In addition, to make text representations from different categories more separable and those from the same category more compact, we introduce a contrastive learning loss function. Experimental results on public-domain benchmark datasets demonstrate the effectiveness of Mask-BERT.
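The two ideas in the abstract, masking out less relevant tokens and pulling same-class representations together, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how token relevance is computed or which contrastive loss is used. The sketch assumes per-token importance scores are already available from some attribution method, and it uses a common supervised contrastive loss as a stand-in. The names `mask_tokens`, `supervised_contrastive_loss`, `keep_ratio`, and `temperature` are hypothetical.

```python
import math


def mask_tokens(tokens, scores, keep_ratio=0.5, mask_token="[MASK]"):
    """Keep the highest-scoring tokens and replace the rest with mask_token.

    `scores` is assumed to come from some attribution method; the abstract
    does not specify how relevance is measured.
    """
    if len(tokens) != len(scores):
        raise ValueError("tokens and scores must have the same length")
    n_keep = max(1, math.ceil(len(tokens) * keep_ratio))
    # Indices of the n_keep most important tokens.
    ranked = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)
    keep = set(ranked[:n_keep])
    return [t if i in keep else mask_token for i, t in enumerate(tokens)]


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _normalize(v):
    norm = math.sqrt(_dot(v, v)) or 1.0
    return [x / norm for x in v]


def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss (an assumed stand-in, not the paper's).

    For each anchor, same-label samples are positives; the loss is low when
    positives are close and other samples are far in cosine similarity.
    """
    z = [_normalize(v) for v in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor without positives contributes nothing
        logits = {j: _dot(z[i], z[j]) / temperature for j in range(n) if j != i}
        # Numerically stable log-sum-exp over all non-anchor samples.
        m = max(logits.values())
        log_denom = m + math.log(sum(math.exp(v - m) for v in logits.values()))
        total += -sum(logits[j] - log_denom for j in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


if __name__ == "__main__":
    tokens = ["the", "movie", "was", "truly", "awful", "."]
    scores = [0.1, 0.6, 0.05, 0.3, 0.9, 0.0]  # made-up importance scores
    print(mask_tokens(tokens, scores))

    emb = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    print(supervised_contrastive_loss(emb, [0, 0, 1, 1]))
```

In a training setup along these lines, the masked token sequence would be fed to the BERT encoder, and the contrastive term would be added to the usual classification loss; how the two are weighted is not stated in the abstract.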

