Federated learning (FL) is a distributed learning paradigm that maximizes the potential of data-driven models for edge devices without sharing their raw data. However, devices often hold non-independent and identically distributed (non-IID) data, meaning that their local data distributions can differ significantly. This heterogeneity in input data distributions across devices, commonly referred to as the feature shift problem, can degrade both the training convergence and the accuracy of the global model. To analyze the intrinsic causes of the feature shift problem, we derive a generalization error bound for FL, which motivates FedCiR, a client-invariant representation learning framework that enables clients to extract informative and client-invariant features. Specifically, we increase the mutual information between representations and labels so that representations carry the knowledge essential for classification, and we reduce the mutual information between the client set and representations, conditioned on labels, so that representations are client-invariant. Because the ground-truth global representation distribution is unavailable, we further incorporate two regularizers into the FL framework that bound these mutual information terms using an approximate global representation distribution, thereby achieving informative and client-invariant feature extraction. To obtain this approximation, we propose a data-free mechanism that is performed by the server and does not compromise privacy. Extensive experiments demonstrate the effectiveness of our approach in achieving client-invariant representation learning and addressing the data heterogeneity issue.
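As a rough sketch of the objective described above, the two mutual information terms can be combined into a single trade-off. The notation here is an assumption made for illustration and is not taken from the abstract: $X$ is the input, $Y$ the label, $K$ the client index, $Z = f_\theta(X)$ the representation produced by an encoder with parameters $\theta$, and $\lambda > 0$ a hypothetical weight balancing the two terms.

```latex
% Illustrative only: symbols Z, K, f_\theta and the weight \lambda are assumed notation.
\max_{\theta} \;
  \underbrace{I(Z; Y)}_{\text{informative representations}}
  \;-\; \lambda \,
  \underbrace{I(Z; K \mid Y)}_{\text{client-dependent information}},
\qquad Z = f_\theta(X)
```

The first term rewards representations that retain label information, and the second penalizes information about client identity that remains once the label is known. According to the abstract, neither term is optimized directly: each is bounded by a regularizer that relies on an approximate global representation distribution. Those bounds are not reproduced in this sketch.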