Adaptive Meta-learner via Gradient Similarity for Few-shot Text Classification

Few-shot text classification aims to classify text when only a few labeled examples are available per class. Most previous methods adopt optimization-based meta-learning to learn the task distribution. However, these methods tend to overfit, for two reasons: they ignore the mismatch between the small number of samples and the complexity of the model, and they do not distinguish useful task features from useless ones. To address this issue, we propose a novel Adaptive Meta-learner via Gradient Similarity (AMGS) method that improves the model's ability to generalize to a new task. Specifically, AMGS alleviates overfitting in two ways: (i) in the inner loop, a self-supervised auxiliary task acquires the latent semantic representation of samples and improves model generalization; (ii) in the outer loop, an adaptive meta-learner based on gradient similarity adds constraints on the gradient obtained by the base-learner. Moreover, we systematically analyze the influence of regularization on the entire framework. Experimental results on several benchmarks demonstrate that AMGS consistently improves few-shot text classification performance compared with state-of-the-art optimization-based meta-learning approaches.
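The abstract does not give the AMGS update rule, so the following is only a minimal sketch of what constraining outer-loop gradients by gradient similarity could look like. The weighting rule (clip the cosine similarity between each task gradient and the mean gradient at zero) and the names `cosine` and `similarity_weighted_update` are assumptions made for illustration, not the authors' method.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_weighted_update(theta, task_grads, lr=0.1):
    """One outer-loop step that down-weights task gradients which disagree
    with the average direction. Assumed rule for illustration, NOT the
    update defined by AMGS."""
    n = len(task_grads)
    dim = len(theta)
    mean = [sum(g[i] for g in task_grads) / n for i in range(dim)]
    # Hypothetical gate: weight = max(0, cos(task gradient, mean gradient)),
    # so gradients pointing against the consensus contribute nothing.
    weights = [max(0.0, cosine(g, mean)) for g in task_grads]
    total = sum(weights)
    if total == 0.0:
        # No task agrees with the mean direction: skip the update.
        return list(theta), weights
    combined = [
        sum(w * g[i] for w, g in zip(weights, task_grads)) / total
        for i in range(dim)
    ]
    new_theta = [t - lr * c for t, c in zip(theta, combined)]
    return new_theta, weights
```

In this sketch, a task whose gradient conflicts with the consensus direction receives weight zero, which is one simple way to keep a single unrepresentative task from dominating the meta-update.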

Few-shot learning

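Few-shot learning (FSL) addresses the problem of improving neural network performance when only a small amount of data is available. A family of methods commonly used in FSL is meta-learning. Like ordinary neural network training, meta-learning has a training phase and a testing phase, but these are called meta-training and meta-testing, respectively.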
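Meta-training and meta-testing are usually organized into small tasks, often called episodes, each with a support set to adapt on and a query set to evaluate on. The sketch below shows one common way to sample an N-way K-shot episode. The function name `sample_episode`, its parameters, and the dictionary layout of the data are assumptions made for illustration.

```python
import random


def sample_episode(data_by_class, n_way, k_shot, n_query, rng):
    """Sample one N-way K-shot episode.

    data_by_class maps a class label to a list of examples (assumed layout).
    Returns (support, query), each a list of (example, label) pairs.
    """
    # Pick the classes for this episode; sorting makes sampling reproducible.
    classes = rng.sample(sorted(data_by_class), n_way)
    support, query = [], []
    for label in classes:
        # Draw disjoint support and query examples for this class.
        items = rng.sample(data_by_class[label], k_shot + n_query)
        support += [(x, label) for x in items[:k_shot]]
        query += [(x, label) for x in items[k_shot:]]
    return support, query
```

In the usual setup, meta-training episodes are drawn from one set of classes and meta-testing episodes from a disjoint set of classes, so that meta-testing measures adaptation to genuinely new tasks.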
