We propose two methods that make unsupervised domain adaptation (UDA) more parameter-efficient using adapters: small bottleneck layers inserted into every layer of a large-scale pre-trained language model (PLM). The first method decomposes UDA into two steps: first adding a domain adapter that learns domain-invariant information, then adding a task adapter that uses this domain-invariant information to learn task representations in the source domain. The second method jointly learns a supervised classifier while reducing a divergence measure. Compared to strong baselines, our simple methods perform well on natural language inference (MNLI) and cross-domain sentiment classification. We even outperform unsupervised domain adaptation methods such as DANN and DSN on sentiment classification, and we come within 0.85% F1 on the natural language inference task, while fine-tuning only a fraction of the full model parameters. We release our code at https://github.com/declare-lab/domadapter
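To make the notion of a "small bottleneck layer" concrete, the following is a minimal sketch of a generic bottleneck adapter with a residual connection, written in plain Python. It is not taken from the released code: the class name `BottleneckAdapter`, the helper `adapter_param_count`, the ReLU nonlinearity, the zero-initialized up-projection, and the sizes used in the usage lines (hidden size 768, bottleneck size 48, 12 layers, a roughly 110M-parameter PLM) are all illustrative assumptions.

```python
import random


def adapter_param_count(hidden_size, bottleneck_size):
    """Trainable parameters in one adapter: down- and up-projection, each with a bias."""
    down = hidden_size * bottleneck_size + bottleneck_size
    up = bottleneck_size * hidden_size + hidden_size
    return down + up


class BottleneckAdapter:
    """Hypothetical adapter: h -> h + W_up * relu(W_down * h + b_down) + b_up."""

    def __init__(self, hidden_size, bottleneck_size, seed=0):
        rng = random.Random(seed)
        # Down-projection: hidden_size -> bottleneck_size, small random weights.
        self.w_down = [
            [rng.gauss(0.0, 0.02) for _ in range(bottleneck_size)]
            for _ in range(hidden_size)
        ]
        self.b_down = [0.0] * bottleneck_size
        # Up-projection: bottleneck_size -> hidden_size.
        # Assumption of this sketch: zero init, so the adapter starts as identity.
        self.w_up = [[0.0] * hidden_size for _ in range(bottleneck_size)]
        self.b_up = [0.0] * hidden_size

    def __call__(self, h):
        n_hidden, n_bottleneck = len(h), len(self.b_down)
        # Project down and apply ReLU.
        z = [
            max(0.0, sum(h[i] * self.w_down[i][j] for i in range(n_hidden)) + self.b_down[j])
            for j in range(n_bottleneck)
        ]
        # Project back up and add the residual connection.
        return [
            h[k] + sum(z[j] * self.w_up[j][k] for j in range(n_bottleneck)) + self.b_up[k]
            for k in range(n_hidden)
        ]


# Illustrative sizes only (assumptions, not figures reported in the paper).
per_adapter = adapter_param_count(hidden_size=768, bottleneck_size=48)
total_adapter_params = 12 * per_adapter  # one adapter per layer, 12 layers
fraction = total_adapter_params / 110_000_000  # assumed PLM size
print(f"adapter params: {total_adapter_params}, fraction of PLM: {fraction:.2%}")
```

Under these assumed sizes, only the adapter weights would be trained while the PLM stays frozen, which is the sense in which such methods fine-tune only a fraction of the full model parameters. In the two-step method described above, a domain adapter and a task adapter would each be a module of this kind.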