In personalized federated learning (PFL), it is widely recognized that achieving both high model generalization and effective personalization poses a significant challenge due to their conflicting nature. As a result, existing PFL methods can only manage a trade-off between these two objectives. This raises an interesting question: Is it feasible to develop a model capable of achieving both objectives simultaneously? Our paper presents an affirmative answer, and the key lies in the observation that deep models inherently exhibit hierarchical architectures, which produce representations with various levels of generalization and personalization at different stages. A straightforward approach stemming from this observation is to select multiple representations from these layers and combine them to concurrently achieve generalization and personalization. However, the number of candidate representations is commonly huge, which makes this method infeasible due to high computational costs. To address this problem, we propose DualFed, a new method that directly yields dual representations corresponding to generalization and personalization respectively, thereby simplifying the optimization task. Specifically, DualFed inserts a personalized projection network between the encoder and classifier. The pre-projection representations capture generalized information shareable across clients, while the post-projection representations effectively capture task-specific information on local clients. This design minimizes the mutual interference between generalization and personalization, thereby achieving a win-win situation. Extensive experiments show that DualFed outperforms other FL methods. Code is available at https://github.com/GuogangZhu/DualFed.
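The architecture described above (a personalized projection network inserted between a shared encoder and the classifier, yielding pre- and post-projection representations) can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the module structure, dimensions, and the choice to fuse the two representations by concatenation before classification are all assumptions for exposition.

```python
# Illustrative sketch of the dual-representation design described in the abstract.
# All module names, dimensions, and the concatenation-based fusion are assumptions.
import torch
import torch.nn as nn


class DualFedSketch(nn.Module):
    def __init__(self, in_dim=784, feat_dim=512, proj_dim=128, num_classes=10):
        super().__init__()
        # Shared encoder: its pre-projection representations capture
        # generalized information shareable across clients.
        self.encoder = nn.Sequential(nn.Linear(in_dim, feat_dim), nn.ReLU())
        # Personalized projection network, kept local to each client:
        # its post-projection representations capture task-specific information.
        self.projector = nn.Sequential(nn.Linear(feat_dim, proj_dim), nn.ReLU())
        # Classifier consuming both representations (fusion choice is assumed).
        self.classifier = nn.Linear(feat_dim + proj_dim, num_classes)

    def forward(self, x):
        z_gen = self.encoder(x)        # pre-projection: generalized representation
        z_per = self.projector(z_gen)  # post-projection: personalized representation
        logits = self.classifier(torch.cat([z_gen, z_per], dim=-1))
        return logits, z_gen, z_per


model = DualFedSketch()
logits, z_gen, z_per = model(torch.randn(4, 784))
```

In a federated setting, a round would then aggregate only the encoder parameters across clients while the projector (and possibly the classifier) stays local, so the two representations are optimized with minimal mutual interference.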