Standard methods for multi-label text classification largely rely on encoder-only pre-trained language models, whereas encoder-decoder models have proven more effective in other classification tasks. In this study, we compare four methods for multi-label classification: two based on an encoder only and two based on an encoder-decoder. We carry out experiments on four datasets (two in the legal domain and two in the biomedical domain, each with two levels of label granularity) and always start from the same pre-trained model, T5. Our results show that encoder-decoder methods outperform encoder-only methods, with an advantage that grows on more complex datasets and on labeling schemes of finer granularity. Using encoder-decoder models in a non-autoregressive fashion, in particular, yields the best performance overall, so we study this approach further through ablations to better understand its strengths.