DA-Net: A Disentangled and Adaptive Network for Multi-Source Cross-Lingual Transfer Learning

Multi-source cross-lingual transfer learning deals with the transfer of task knowledge from multiple labeled source languages to an unlabeled target language under language shift. Existing methods typically focus on weighting the predictions produced by the language-specific classifiers of the different sources, which follow a shared encoder. However, because this encoder is shared and updated by all source languages, the extracted representations inevitably mix information from different source languages, which may disturb the learning of the language-specific classifiers. Additionally, due to the language gap, language-specific classifiers trained with source labels are unable to make accurate predictions for the target language. Both facts impair the model's performance. To address these challenges, we propose a Disentangled and Adaptive Network (DA-Net). First, we devise a feedback-guided collaborative disentanglement method that seeks to purify the input representations of the classifiers, thereby mitigating mutual interference among the multiple sources. Second, we propose a class-aware parallel adaptation method that aligns class-level distributions for each source-target language pair, thereby narrowing the language gap within each pair. Experimental results on three different tasks involving 38 languages validate the effectiveness of our approach.
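To make the idea of aligning class-level distributions per source-target pair more concrete, here is a minimal sketch. It is not DA-Net's actual objective, which the abstract does not specify. It assumes one simple stand-in: measuring the squared distance between per-class mean features (centroids) of a source language and of the target language, where target classes come from pseudo-labels. All function names (`class_centroids`, `class_alignment_gap`, `parallel_gaps`) and the use of pseudo-labels are illustrative assumptions.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def class_centroids(features: List[Vector], labels: List[int]) -> Dict[int, List[float]]:
    """Return the mean feature vector for each class label."""
    sums: Dict[int, List[float]] = {}
    counts: Dict[int, int] = {}
    for vec, label in zip(features, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
            counts[label] = 0
        for i, x in enumerate(vec):
            sums[label][i] += x
        counts[label] += 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def class_alignment_gap(
    src_feats: List[Vector],
    src_labels: List[int],
    tgt_feats: List[Vector],
    tgt_pseudo_labels: List[int],
) -> float:
    """Average squared Euclidean distance between source and target
    centroids, taken over the classes present on both sides.

    Assumption: target labels are pseudo-labels, since the target
    language is unlabeled in this setting.
    """
    src = class_centroids(src_feats, src_labels)
    tgt = class_centroids(tgt_feats, tgt_pseudo_labels)
    shared = sorted(set(src) & set(tgt))
    if not shared:
        return 0.0
    total = 0.0
    for c in shared:
        total += sum((a - b) ** 2 for a, b in zip(src[c], tgt[c]))
    return total / len(shared)


def parallel_gaps(
    sources: Dict[str, Tuple[List[Vector], List[int]]],
    tgt_feats: List[Vector],
    tgt_pseudo_labels: List[int],
) -> Dict[str, float]:
    """Compute one class-level gap per source-target language pair."""
    return {
        lang: class_alignment_gap(feats, labels, tgt_feats, tgt_pseudo_labels)
        for lang, (feats, labels) in sources.items()
    }
```

In a trainable model, a quantity like this would typically be computed on encoder outputs and minimized alongside the classification loss, once per source language, so that each source-target pair is adapted independently rather than pooling all sources together.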

