We present a new approach called MeritOpt, based on the Personalized Federated Learning algorithm MeritFed, that can be applied to natural language tasks with heterogeneous data. We evaluate it on the low-resource machine translation task, using datasets of Southeast Asian and Finno-Ugric languages. In addition to its effectiveness, MeritOpt is also highly interpretable, as it can be used to track the impact of each language used for training. Our analysis reveals that the target dataset size affects the weight distribution across auxiliary languages, that unrelated languages do not interfere with training, and that auxiliary optimizer parameters have minimal impact. Our approach is easy to apply with a few lines of code, and we provide scripts for reproducing the experiments at https://github.com/VityaVitalich/MeritOpt.