In Federated Learning (FL), forgetting, the loss of knowledge across rounds, hampers algorithm convergence, particularly under severe data heterogeneity among clients. This study examines the issue in detail and emphasizes the critical role forgetting plays in FL's inefficient learning on heterogeneous data. Knowledge loss occurs in both the client-local updates and the server-side aggregation step; addressing one without the other fails to mitigate forgetting. We introduce a metric that measures forgetting at a granular level, so that it remains distinguishable even while new knowledge is being acquired. Leveraging these insights, we propose Flashback, an FL algorithm that uses a dynamic distillation approach to regularize the local models and to aggregate their knowledge effectively. Across different benchmarks, Flashback outperforms other methods, mitigates forgetting, and reaches the target accuracy in fewer rounds, converging in 6 to 16 rounds.
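To make the idea of a granular forgetting measure concrete, the following is a minimal sketch and not necessarily the metric defined in the paper. It assumes forgetting is measured per class, as the average drop in per-class accuracy between two consecutive rounds, with gains clipped to zero so that newly acquired knowledge cannot mask losses. The function names `per_class_accuracy` and `forgetting` are hypothetical.

```python
import numpy as np

def per_class_accuracy(y_true, y_pred, num_classes):
    """Accuracy computed separately for each class (0 for classes with no samples)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    acc = np.zeros(num_classes)
    for c in range(num_classes):
        mask = y_true == c
        if mask.any():
            acc[c] = np.mean(y_pred[mask] == c)
    return acc

def forgetting(prev_acc, curr_acc):
    """Assumed metric: mean per-class accuracy drop between two rounds.

    Only negative changes count; improvements are clipped to zero so that
    gains on some classes do not cancel out losses on others.
    """
    prev_acc = np.asarray(prev_acc, dtype=float)
    curr_acc = np.asarray(curr_acc, dtype=float)
    drops = np.minimum(0.0, curr_acc - prev_acc)
    return float(-drops.mean())

# Illustrative numbers only (not results from the paper).
prev_round = np.array([0.9, 0.8, 0.1])
curr_round = np.array([0.5, 0.8, 0.7])
print("mean accuracy change:", curr_round.mean() - prev_round.mean())
print("forgetting:", forgetting(prev_round, curr_round))
```

In this illustrative case the mean accuracy rises between rounds, yet the sketch still reports a forgetting value of roughly 0.133 because the first class lost accuracy. An aggregate accuracy curve alone would hide that loss.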