Federated Learning (FL) can benefit substantially from collaboratively training capacity-heterogeneous models, which makes the private data and computing power of low-capacity devices usable. However, little work has focused on personalizing capacity-heterogeneous models to client-specific data, which leaves local model utility suboptimal, particularly for low-capacity clients. Heterogeneity in both data and device capacity poses two key challenges for model personalization: 1) accurately retaining the necessary knowledge within the reduced submodel of each client, and 2) effectively sharing knowledge by aggregating parameters of varying sizes. To this end, we introduce Pa3dFL, a novel framework that enhances local model performance by decoupling and selectively sharing knowledge among capacity-heterogeneous models. First, we decompose each layer of the model into general and personal parameters. Second, we keep the general parameters the same size across clients and aggregate them by direct averaging. Third, we employ a hyper-network to generate size-varying personal parameters for clients from learnable embeddings. Finally, we aggregate the personal parameters implicitly by aggregating client embeddings through a self-attention module. We conduct extensive experiments on three datasets to evaluate Pa3dFL. The results show that Pa3dFL consistently outperforms baseline methods across various heterogeneity settings. Moreover, Pa3dFL achieves communication and computation efficiency competitive with the baselines, highlighting its practicality and adaptability under adverse system conditions.
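The four steps named in the abstract (layer decomposition, direct averaging of general parameters, hyper-network generation of personal parameters, and self-attention over client embeddings) can be illustrated with a toy sketch. This is not the Pa3dFL implementation: the function names, the linear hyper-network, the projection-free attention, and the choice to obtain smaller personal parameters by keeping the first `size` hyper-network outputs are all assumptions made purely for illustration.

```python
import math
import random


def average_general(general_params):
    """Direct element-wise average of the (same-sized) general parameters."""
    n = len(general_params)
    return [sum(vals) / n for vals in zip(*general_params)]


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attend_embeddings(embeddings):
    """Mix client embeddings with scaled dot-product self-attention.

    Assumption: no learned query/key/value projections, to keep the sketch small.
    """
    d = len(embeddings[0])
    mixed = []
    for q in embeddings:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in embeddings]
        weights = softmax(scores)
        mixed.append([sum(w * v[j] for w, v in zip(weights, embeddings)) for j in range(d)])
    return mixed


def generate_personal(embedding, hyper_weights, size):
    """Toy linear hyper-network (hypothetical): one output per weight row.

    Assumption: a client of capacity `size` keeps only the first `size` outputs.
    """
    return [sum(w * e for w, e in zip(row, embedding)) for row in hyper_weights[:size]]


random.seed(0)

# General parameters have the same size on every client, so they average directly.
general = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
shared = average_general(general)

# Each client owns a learnable embedding; attention shares knowledge between them.
embeddings = [[random.gauss(0, 1) for _ in range(4)] for _ in range(3)]
mixed = attend_embeddings(embeddings)

# The hyper-network maps each mixed embedding to personal parameters of client-specific size.
hyper_weights = [[random.gauss(0, 1) for _ in range(4)] for _ in range(8)]
capacities = [2, 4, 8]
personal = [generate_personal(e, hyper_weights, c) for e, c in zip(mixed, capacities)]

print(shared)
print([len(p) for p in personal])
```

In this sketch only the embeddings pass through attention, so personal parameters of different sizes never need to be averaged against each other; that mirrors the "implicit aggregation" idea described above, under the stated simplifications.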