Task-adaptive pre-training (TAPT) and self-training (ST) have emerged as the major semi-supervised approaches for improving natural language understanding (NLU) tasks with massive amounts of unlabeled data. However, it is unclear whether they learn similar representations or whether they can be effectively combined. In this paper, we show that TAPT and ST can be complementary under a simple protocol that follows the TAPT -> Finetuning -> Self-training (TFS) process. Experimental results show that the TFS protocol can effectively utilize unlabeled data to achieve strong combined gains consistently across six datasets covering sentiment classification, paraphrase identification, natural language inference, named entity recognition, and dialogue slot classification. We investigate various semi-supervised settings and consistently show that the gains from TAPT and ST can be strongly additive when following the TFS procedure. We hope that TFS can serve as an important semi-supervised baseline for future NLP studies.
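The three TFS stages named in the abstract can be sketched as a small orchestration function. This is only an illustrative outline, not the authors' implementation: the function name `tfs`, the callables `adapt`, `finetune`, and `predict`, the confidence `threshold`, the use of hard pseudo-labels, and the choice to start the student from the TAPT checkpoint are all assumptions introduced here.

```python
from typing import Any, Callable, List, Sequence, Tuple


def tfs(
    pretrained: Any,
    labeled: Sequence[Tuple[Any, Any]],
    unlabeled: Sequence[Any],
    adapt: Callable[[Any, Sequence[Any]], Any],
    finetune: Callable[[Any, List[Tuple[Any, Any]]], Any],
    predict: Callable[[Any, Any], Tuple[Any, float]],
    threshold: float = 0.9,  # hypothetical confidence cutoff for pseudo-labels
) -> Any:
    """Run TAPT -> Finetuning -> Self-training with caller-supplied steps."""
    # Stage 1 (TAPT): continue pre-training on the unlabeled task text.
    adapted = adapt(pretrained, unlabeled)

    # Stage 2 (Finetuning): train a teacher on the labeled data.
    teacher = finetune(adapted, list(labeled))

    # Stage 3 (Self-training): pseudo-label the unlabeled examples with the
    # teacher, keeping only predictions at or above the threshold (assumption).
    pseudo: List[Tuple[Any, Any]] = []
    for x in unlabeled:
        label, confidence = predict(teacher, x)
        if confidence >= threshold:
            pseudo.append((x, label))

    # Train a student on labeled plus pseudo-labeled data. Initializing the
    # student from the TAPT checkpoint is an assumption of this sketch.
    student = finetune(adapted, list(labeled) + pseudo)
    return student
```

Passing the three steps in as callables keeps the sketch independent of any particular model library; in practice `adapt` would typically wrap a masked-language-modeling loop and `finetune` a supervised training loop.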