Federated learning has become a popular method for learning from decentralized, heterogeneous data. Federated semi-supervised learning (FSSL) has emerged to train models from a small fraction of labeled data, motivated by label scarcity on decentralized clients. Existing FSSL methods assume independent and identically distributed (IID) labeled data across clients and a consistent class distribution between labeled and unlabeled data within a client. This work studies a more practical and challenging FSSL scenario, where the data distribution differs not only across clients but also within a client between labeled and unlabeled data. To address this challenge, we propose FedDure, a novel FSSL framework with dual regulators. FedDure lifts the previous assumption with a coarse-grained regulator (C-reg) and a fine-grained regulator (F-reg): C-reg regularizes the updating of the local model by tracking the learning effect on the labeled data distribution; F-reg learns an adaptive weighting scheme tailored to the unlabeled instances in each client. We further formulate client model training as a bi-level optimization that adaptively optimizes the client model with the two regulators. Theoretically, we show a convergence guarantee for the dual regulators. Empirically, we demonstrate that FedDure is superior to existing methods across a wide range of settings, notably by more than 11% on the CIFAR-10 and CINIC-10 datasets.
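To make the idea of an adaptive weighting scheme over unlabeled instances more concrete, the following is a minimal sketch of a per-instance weighted pseudo-label loss. It is an illustration under stated assumptions and not FedDure's actual formulation: the function names, the use of hard pseudo-labels, and the sigmoid squashing of weights are all hypothetical choices, and `weight_scores` merely stands in for the output of some learned weighting module.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    """Map an unbounded score to a weight in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def weighted_unlabeled_loss(logits_batch, weight_scores):
    """Mean pseudo-label cross-entropy with one weight per unlabeled instance.

    Assumption: weight_scores are raw scores from a hypothetical learned
    weighting module; here they are simply passed in as numbers.
    """
    total = 0.0
    for logits, score in zip(logits_batch, weight_scores):
        probs = softmax(logits)
        # Hard pseudo-label: the class the model currently finds most likely.
        pseudo_label = max(range(len(probs)), key=probs.__getitem__)
        cross_entropy = -math.log(probs[pseudo_label])
        # Instances with low scores contribute less to the update.
        total += sigmoid(score) * cross_entropy
    return total / len(logits_batch)


if __name__ == "__main__":
    batch = [[2.0, 0.5, -1.0], [0.1, 0.0, 0.2]]
    scores = [3.0, -3.0]  # trust the first instance far more than the second
    print(weighted_unlabeled_loss(batch, scores))
```

In a bi-level setup of the kind the abstract describes, the scores would themselves be learned rather than fixed by hand; this sketch only shows how instance-level weights enter the unlabeled loss.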