This paper studies a new task: federated learning (FL) for semantic parsing, in which multiple clients collaboratively train one global model without sharing their semantic parsing data. Because it leverages data from multiple clients, the FL paradigm can be especially beneficial for clients that have too little training data to develop a data-hungry neural semantic parser on their own. We propose an evaluation setup to study this task, in which we re-purpose widely used single-domain text-to-SQL datasets as clients to form a realistic heterogeneous FL setting and collaboratively train a global model. Because standard FL algorithms suffer from the high client heterogeneity in this realistic setup, we further propose a novel LOss Reduction Adjusted Re-weighting (Lorar) mechanism to mitigate the performance degradation. Lorar adjusts each client's contribution to the global model update based on its training loss reduction during each round. Our intuition is that the larger the loss reduction, the further the current global model is from the client's local optimum, and the larger the weight the client should receive. Applying Lorar to three widely adopted FL algorithms (FedAvg, FedOPT and FedProx), we observe that their performance improves substantially on average (4%-20% absolute gain under MacroAvg) and that clients with smaller datasets enjoy larger performance gains. In addition, the global model converges faster for almost all clients.
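The re-weighting idea described above can be illustrated with a minimal sketch. The abstract does not give the exact Lorar formula, so everything below is an assumption made for illustration: the function names `loss_reduction_weights` and `aggregate` are hypothetical, each client's weight is taken to be proportional to its (non-negative) loss reduction in the round, optionally scaled by its dataset size, and the global update is a FedAvg-style weighted average of flat parameter vectors.

```python
from typing import List, Optional, Sequence


def loss_reduction_weights(
    loss_before: Sequence[float],
    loss_after: Sequence[float],
    num_examples: Optional[Sequence[int]] = None,
) -> List[float]:
    """Return normalized client weights for one round (illustrative only).

    Assumption: weight_i is proportional to size_i * max(loss_before_i - loss_after_i, 0).
    This is not necessarily the formula used in the paper.
    """
    if len(loss_before) != len(loss_after):
        raise ValueError("loss_before and loss_after must have the same length")
    n_clients = len(loss_before)
    if num_examples is None:
        sizes = [1.0] * n_clients
    else:
        sizes = [float(n) for n in num_examples]

    # A client whose loss did not decrease contributes a zero score.
    reductions = [max(b - a, 0.0) for b, a in zip(loss_before, loss_after)]
    scores = [r * s for r, s in zip(reductions, sizes)]
    total = sum(scores)

    if total <= 0.0:
        # No client reduced its loss: fall back to size-proportional weights.
        size_total = sum(sizes)
        return [s / size_total for s in sizes]
    return [s / total for s in scores]


def aggregate(
    client_params: Sequence[Sequence[float]], weights: Sequence[float]
) -> List[float]:
    """Weighted average of client parameter vectors (FedAvg-style)."""
    dim = len(client_params[0])
    return [
        sum(w * params[k] for w, params in zip(weights, client_params))
        for k in range(dim)
    ]


if __name__ == "__main__":
    # Two hypothetical clients; the first reduced its loss twice as much.
    w = loss_reduction_weights([2.0, 2.0], [1.0, 1.5])
    print(w)
    print(aggregate([[0.0, 0.0], [3.0, 3.0]], w))
```

In this sketch, a client with a larger loss reduction pulls the aggregated parameters further toward its own local update, which matches the stated intuition that such a client is further from its local optimum under the current global model.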