The Mixup method has proven to be a powerful data augmentation technique in Computer Vision, with many successors that perform image mixing in a guided manner. One interesting research direction is transferring the underlying Mixup idea to other domains, e.g., Natural Language Processing (NLP). Even though several methods already apply Mixup to textual data, there is still room for new, improved approaches. In this work, we introduce AttentionMix, a novel mixing method that relies on attention-based information. While the paper focuses on the BERT attention mechanism, the proposed approach can in principle be applied to any attention-based model. AttentionMix is evaluated on three standard sentiment classification datasets and in all three cases outperforms two benchmark approaches that utilize the Mixup mechanism, as well as the vanilla BERT method. The results confirm that attention-based information can be effectively used for data augmentation in the NLP domain.
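For readers unfamiliar with the underlying idea, the following is a minimal sketch of standard Mixup, which forms a convex combination of two training examples and of their labels using a coefficient drawn from a Beta distribution. It does not reproduce AttentionMix, whose attention-guided mixing is not detailed in this abstract. The function name `mixup`, the array shapes, the use of embedding-like matrices as inputs, and the value `alpha = 0.4` are all assumptions made for illustration.

```python
import numpy as np


def mixup(x_a, y_a, x_b, y_b, lam):
    """Standard Mixup: blend two examples and their one-hot labels with weight lam."""
    x_mix = lam * x_a + (1.0 - lam) * x_b
    y_mix = lam * y_a + (1.0 - lam) * y_b
    return x_mix, y_mix


rng = np.random.default_rng(0)

# Hypothetical inputs: two sequences of 4 token embeddings with hidden size 8.
x_a = rng.normal(size=(4, 8))
x_b = rng.normal(size=(4, 8))

# One-hot labels for a two-class (e.g. negative/positive sentiment) task.
y_a = np.array([1.0, 0.0])
y_b = np.array([0.0, 1.0])

# Mixing coefficient sampled from Beta(alpha, alpha); alpha is an assumed value.
alpha = 0.4
lam = rng.beta(alpha, alpha)

x_mix, y_mix = mixup(x_a, y_a, x_b, y_b, lam)
```

Here a single scalar weights every position of the input equally. Guided variants of Mixup differ in how the mixing weights are chosen rather than in this basic interpolation step.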