The effectiveness of contrastive learning in natural language processing tasks has yet to be fully explored and analyzed. The core challenge of contrastive learning is constructing positive and negative samples correctly and reasonably, and identifying contrastive objects is even harder in multi-label text classification tasks. Very few contrastive losses have previously been proposed for this setting. In this paper, we approach the problem from a different angle by proposing five novel contrastive losses for multi-label text classification tasks: Strict Contrastive Loss (SCL), Intra-label Contrastive Loss (ICL), Jaccard Similarity Contrastive Loss (JSCL), Jaccard Similarity Probability Contrastive Loss (JSPCL), and Stepwise Label Contrastive Loss (SLCL). We use these losses to explore the effectiveness of contrastive learning for multi-label text classification tasks and provide a set of baseline models for applying contrastive learning techniques to specific tasks. We further perform an interpretability analysis of our approach to show how the different components of the contrastive losses play their roles. The experimental results show that the proposed contrastive losses improve performance on multi-label text classification tasks. Our work also examines how contrastive learning should be adapted to multi-label text classification tasks.
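To make the sample-construction problem concrete, the following is a minimal sketch of two ways a batch of multi-label examples might be paired. It is not the paper's implementation: the abstract does not define the losses, so the pairing rules here are assumptions inferred only from the names "Strict" and "Jaccard Similarity". The helper names (`jaccard`, `pairwise_jaccard`, `strict_positive_mask`) and the emotion labels in the toy batch are hypothetical.

```python
from typing import List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity |a & b| / |a | b|; taken as 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def pairwise_jaccard(labels: List[Set[str]]) -> List[List[float]]:
    """Soft pairing (assumption): every pair gets a weight in [0, 1] from label overlap."""
    return [[jaccard(a, b) for b in labels] for a in labels]


def strict_positive_mask(labels: List[Set[str]]) -> List[List[bool]]:
    """Strict pairing (assumption): two different samples are positives
    only when their label sets are exactly identical."""
    n = len(labels)
    return [[i != j and labels[i] == labels[j] for j in range(n)] for i in range(n)]


# Hypothetical toy batch: one label set per text.
batch = [{"joy", "surprise"}, {"joy"}, {"joy", "surprise"}, {"anger"}]
sim = pairwise_jaccard(batch)
mask = strict_positive_mask(batch)
```

In this toy batch, samples 0 and 1 share one of two distinct labels, so a strict rule treats them as a negative pair while an overlap-based weight places them halfway between positive and negative. That ambiguity is what makes choosing contrastive objects harder in the multi-label setting than in the single-label one.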