Federated learning has become a popular method for learning from decentralized, heterogeneous data. Federated semi-supervised learning (FSSL) has emerged to train models from a small fraction of labeled data, since labels are scarce on decentralized clients. Existing FSSL methods assume that labeled data are independent and identically distributed (IID) across clients and that the class distribution of labeled and unlabeled data within a client is consistent. This work studies a more practical and challenging FSSL scenario, in which the data distribution differs not only across clients but also within a client, between its labeled and unlabeled data. To address this challenge, we propose FedDure, a novel FSSL framework with dual regulators. FedDure lifts the previous assumption with a coarse-grained regulator (C-reg) and a fine-grained regulator (F-reg): C-reg regularizes the updating of the local model by tracking the learning effect on the labeled data distribution; F-reg learns an adaptive weighting scheme tailored to the unlabeled instances in each client. We further formulate client model training as a bi-level optimization problem that adaptively optimizes the client model with the two regulators. Theoretically, we show a convergence guarantee for the dual regulators. Empirically, we demonstrate that FedDure outperforms existing methods across a wide range of settings, notably by more than 11% on the CIFAR-10 and CINIC-10 datasets.
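To make the idea of instance-level weighting on unlabeled data concrete, the following is a minimal sketch, not FedDure's actual F-reg. It assumes hard argmax pseudo-labels and takes the per-instance weights as a plain input list, whereas the abstract describes these weights as learned by F-reg. The function names `softmax` and `weighted_pseudo_label_loss` are hypothetical and chosen for illustration only.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_pseudo_label_loss(batch_logits, weights):
    """Average cross-entropy against argmax pseudo-labels, scaled per instance.

    Assumption: `weights` stands in for the output of a learned weighting
    scheme; here the caller supplies it directly.
    """
    if len(batch_logits) != len(weights):
        raise ValueError("one weight is required per unlabeled instance")
    total = 0.0
    for logits, w in zip(batch_logits, weights):
        probs = softmax(logits)
        # Hard pseudo-label: the class the model currently finds most likely.
        pseudo = max(range(len(probs)), key=probs.__getitem__)
        total += w * -math.log(probs[pseudo])
    return total / len(batch_logits)


# Two unlabeled instances; the second is down-weighted to zero, so it
# contributes nothing to the loss.
loss = weighted_pseudo_label_loss([[0.0, 0.0], [2.0, -1.0]], [1.0, 0.0])
```

Setting a weight to zero removes an instance from the unsupervised objective entirely, which is the simplest way to see how a per-instance weighting can suppress unlabeled samples whose pseudo-labels are unreliable.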