We propose two methods that make unsupervised domain adaptation (UDA) more parameter-efficient using adapters, small bottleneck layers inserted into every layer of a large-scale pre-trained language model (PLM). The first method decomposes UDA into two steps: it first adds a domain adapter to learn domain-invariant information, and then adds a task adapter that uses this domain-invariant information to learn task representations in the source domain. The second method jointly learns a supervised classifier while reducing a divergence measure between the source and target domains. Compared to strong baselines, our simple methods perform well on natural language inference (MNLI) and on cross-domain sentiment classification. We even outperform unsupervised domain adaptation methods such as DANN and DSN on sentiment classification, and we are within 0.85% F1 on natural language inference, while fine-tuning only a fraction of the full model parameters. We release our code at https://github.com/declare-lab/UDAPTER
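
To make the notion of an adapter concrete, the following is a minimal pure-Python sketch of a generic bottleneck adapter: a down-projection, a nonlinearity, an up-projection, and a residual connection. It is an illustration under stated assumptions, not the released implementation. The class name `BottleneckAdapter`, the ReLU nonlinearity, the zero initialization of the up-projection, and the sizes used below are all hypothetical choices made for this example.

```python
import math
import random


def matvec(weights, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


class BottleneckAdapter:
    """Generic bottleneck adapter (hypothetical sketch, not the authors' code).

    Computes x + W_up * relu(W_down * x + b_down) + b_up for one hidden vector x.
    """

    def __init__(self, hidden, bottleneck, seed=0):
        rng = random.Random(seed)
        scale = 1.0 / math.sqrt(hidden)
        # Down-projection: hidden -> bottleneck, small random weights.
        self.w_down = [[rng.gauss(0.0, scale) for _ in range(hidden)]
                       for _ in range(bottleneck)]
        self.b_down = [0.0] * bottleneck
        # Up-projection: bottleneck -> hidden. Zero initialization is an
        # assumption of this sketch; it makes the adapter start as identity.
        self.w_up = [[0.0] * bottleneck for _ in range(hidden)]
        self.b_up = [0.0] * hidden

    def num_params(self):
        """Trainable parameters: two weight matrices plus two bias vectors."""
        hidden, bottleneck = len(self.w_up), len(self.w_down)
        return 2 * hidden * bottleneck + hidden + bottleneck

    def __call__(self, x):
        pre = [a + b for a, b in zip(matvec(self.w_down, x), self.b_down)]
        z = [max(0.0, v) for v in pre]  # ReLU
        delta = [a + b for a, b in zip(matvec(self.w_up, z), self.b_up)]
        return [xi + di for xi, di in zip(x, delta)]  # residual connection


# Two-step composition as described above: a domain adapter followed by a
# task adapter. The PLM layer itself is omitted and would stay frozen.
domain_adapter = BottleneckAdapter(hidden=8, bottleneck=2, seed=0)
task_adapter = BottleneckAdapter(hidden=8, bottleneck=2, seed=1)
hidden_state = [1.0] * 8
output = task_adapter(domain_adapter(hidden_state))
```

Because each adapter adds only `2 * hidden * bottleneck + hidden + bottleneck` parameters per insertion point, the trainable parameter count stays small relative to the frozen PLM whenever the bottleneck is much narrower than the hidden size.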