The effectiveness of contrastive learning in natural language processing tasks remains to be explored and analyzed. The core challenge of contrastive learning is constructing positive and negative samples correctly and reasonably, and identifying contrastive objects is even harder in multi-label text classification tasks. Very few contrastive losses have been proposed so far. In this paper, we approach the problem from a different angle by proposing five novel contrastive losses for multi-label text classification tasks: Strict Contrastive Loss (SCL), Intra-label Contrastive Loss (ICL), Jaccard Similarity Contrastive Loss (JSCL), Jaccard Similarity Probability Contrastive Loss (JSPCL), and Stepwise Label Contrastive Loss (SLCL). Using these losses, we explore the effectiveness of contrastive learning for multi-label text classification tasks and provide a set of baseline models for deploying contrastive learning techniques on specific tasks. We further perform an interpretability analysis of our approach to show what role each component of the contrastive losses plays. The experimental results show that the proposed contrastive losses improve performance on multi-label text classification tasks. Our work also examines how contrastive learning should be adapted for multi-label text classification tasks.
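To illustrate why choosing contrastive pairs is harder in the multi-label setting, the sketch below computes the standard Jaccard similarity between the label sets of samples in a batch. It is not the definition of JSCL or of any other loss named above, which the abstract does not give; the function names and the toy batch are hypothetical and serve only to show that label sets can overlap partially, so a pair is not simply "positive" or "negative".

```python
from typing import List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Standard Jaccard similarity |a & b| / |a | b|.

    Assumption: two empty label sets are given similarity 0.0.
    """
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def pairwise_label_similarity(labels: List[Set[str]]) -> List[List[float]]:
    """Jaccard similarity between the label sets of every pair in a batch."""
    n = len(labels)
    return [[jaccard(labels[i], labels[j]) for j in range(n)] for i in range(n)]


# Hypothetical toy batch: each sample carries a set of labels.
batch = [
    {"sports", "politics"},
    {"sports"},
    {"finance"},
    {"sports", "politics"},
]
sim = pairwise_label_similarity(batch)

# Samples 0 and 3 share all labels, 0 and 1 share only some, 0 and 2 share none.
for j in (1, 2, 3):
    print(f"similarity(sample 0, sample {j}) = {sim[0][j]}")
```

A strict scheme would treat only the exact match (samples 0 and 3) as a positive pair, whereas a similarity-weighted scheme could also give partial credit to samples 0 and 1. Which of these choices each proposed loss makes is defined in the paper body, not here.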