Federated learning has become a popular method for learning from decentralized, heterogeneous data. Because labels are scarce on decentralized clients, federated semi-supervised learning (FSSL) has emerged to train models from a small fraction of labeled data. Existing FSSL methods assume that labeled data are independent and identically distributed (IID) across clients and that, within a client, labeled and unlabeled data share the same class distribution. This work studies a more practical and challenging FSSL scenario, in which the data distribution differs not only across clients but also between labeled and unlabeled data within a client. To address this challenge, we propose FedDure, a novel FSSL framework with dual regulators. FedDure lifts these assumptions with a coarse-grained regulator (C-reg) and a fine-grained regulator (F-reg): C-reg regularizes the updating of the local model by tracking the learning effect on the labeled data distribution; F-reg learns an adaptive weighting scheme tailored to the unlabeled instances in each client. We further formulate client model training as a bi-level optimization that adaptively optimizes the client model with the two regulators. Theoretically, we establish a convergence guarantee for the dual regulators. Empirically, we demonstrate that FedDure outperforms existing methods across a wide range of settings, notably by more than 11% on the CIFAR-10 and CINIC-10 datasets.
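To make the idea of per-instance weighting of unlabeled data concrete, the following is a minimal sketch and not the FedDure implementation. It assumes hard pseudo-labels taken as the argmax of the model's prediction and a cross-entropy loss scaled by one weight per unlabeled instance. The function names `softmax` and `weighted_pseudo_label_loss` are hypothetical, and the weights are passed in as plain numbers, whereas in FedDure they would be produced by the learned F-reg.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_pseudo_label_loss(logits_batch, weights):
    """Average instance-weighted cross-entropy against argmax pseudo-labels.

    logits_batch: list of per-instance logit lists (unlabeled data).
    weights: one non-negative weight per instance. Assumption: these are
             placeholders for whatever a learned weighting scheme outputs.
    """
    if len(logits_batch) != len(weights):
        raise ValueError("need exactly one weight per unlabeled instance")
    if not logits_batch:
        raise ValueError("batch must not be empty")

    total = 0.0
    for logits, w in zip(logits_batch, weights):
        probs = softmax(logits)
        # Hard pseudo-label: the class the model currently finds most likely.
        pseudo = max(range(len(probs)), key=probs.__getitem__)
        # Cross-entropy against the pseudo-label, scaled by the instance weight.
        total += -w * math.log(probs[pseudo])
    return total / len(logits_batch)


if __name__ == "__main__":
    batch = [[2.0, 0.5, -1.0], [0.1, 0.0, 0.2]]
    # Down-weight the second, low-confidence instance.
    print(weighted_pseudo_label_loss(batch, [1.0, 0.2]))
```

Setting a weight to zero removes that instance from the loss entirely, which is how a weighting scheme of this kind can suppress unlabeled samples whose pseudo-labels are likely to be unreliable.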