Hierarchical Verbalizer for Few-Shot Hierarchical Text Classification

Hierarchical text classification (HTC) performs poorly in low-resource and few-shot settings because of its complex label hierarchy and the high cost of labeling. Prompting pre-trained language models (PLMs) has recently proven effective for few-shot flat text classification, but little work has studied prompt-based learning for HTC when training data is extremely scarce. In this work, we define a path-based few-shot setting and establish a strict path-based evaluation metric to further explore few-shot HTC. To address this gap, we propose the hierarchical verbalizer ("HierVerb"), a multi-verbalizer framework that treats HTC as a single- or multi-label classification problem at multiple layers and learns vectors as verbalizers, constrained by the hierarchical structure and by hierarchical contrastive learning. In this way, HierVerb fuses label-hierarchy knowledge into the verbalizers and remarkably outperforms methods that inject the hierarchy through graph encoders, maximizing the benefits of PLMs. Extensive experiments on three popular HTC datasets under few-shot settings demonstrate that prompting with HierVerb significantly boosts HTC performance and offers an elegant way to bridge the gap between large pre-trained models and downstream hierarchical classification tasks. Our code and few-shot dataset are publicly available at https://github.com/1KE-JI/HierVerb.
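The abstract does not spell out how the strict path-based metric is computed. As a rough illustration only, the sketch below assumes that a prediction counts as correct only when an entire root-to-leaf label path matches a gold path, and computes a micro-averaged F1 over such paths. The function name `path_micro_f1`, the `Path` type, and the label names are hypothetical and are not taken from the HierVerb code.

```python
from typing import List, Set, Tuple

# A path is the tuple of labels from the top layer down to the leaf,
# e.g. ("Science", "Physics"). This representation is an assumption.
Path = Tuple[str, ...]


def path_micro_f1(gold: List[Set[Path]], pred: List[Set[Path]]) -> float:
    """Micro-F1 where the unit of comparison is a whole label path."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must cover the same documents")
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        tp += len(g & p)  # paths predicted exactly, every layer matching
        fp += len(p - g)  # predicted paths that differ at one or more layers
        fn += len(g - p)  # gold paths that were not fully recovered
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical two-document example: the second prediction gets the
# first layer right but the second layer wrong, so the path is wrong.
gold_paths = [{("Science", "Physics")}, {("Arts", "Music")}]
pred_paths = [{("Science", "Physics")}, {("Arts", "Film")}]
score = path_micro_f1(gold_paths, pred_paths)
```

Under this assumed definition, a prediction that is right at the parent layer but wrong at the child layer earns no credit, which is what makes a path-level score stricter than scoring each label independently.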



Few-shot learning


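Few-shot learning (FSL) addresses how to improve a neural network's performance when little data is available. One family of methods often used in FSL is meta-learning. Like ordinary neural-network training, meta-learning has a training phase and a testing phase, which are called meta-training and meta-testing, respectively.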
