Despite significant advances in multi-label text classification, the ability of existing models to generalize to novel and rarely seen complex concepts, which are compositions of elementary ones, remains underexplored. This work addresses that gap. By creating distinct data splits on three benchmarks, we assess the compositional generalization ability of existing multi-label text classification models. Our results show that these models often fail to generalize to compositional concepts encountered infrequently during training, leading to inferior performance on test sets containing these new combinations. To address this, we introduce a data augmentation method that leverages two novel text generation models designed to enhance the classification models' capacity for compositional generalization. Our experiments show that this data augmentation approach significantly improves the compositional generalization of classification models on our benchmarks, with both generation models surpassing other text generation baselines.