Federated learning (FL) has enabled distributed learning of a model across multiple clients in a privacy-preserving manner. One of the main challenges of FL is to accommodate clients with varying hardware capacities; clients differ in their compute and memory constraints. To tackle this challenge, recent state-of-the-art approaches leverage early exits. Nonetheless, these approaches fall short of mitigating the challenges of jointly learning multiple exit classifiers, often relying on hand-picked heuristic solutions for knowledge distillation among classifiers and/or using additional layers for weaker classifiers. In this work, instead of using multiple classifiers, we propose a recurrent early exit approach named ReeFL that fuses features from different sub-models into a single shared classifier. Specifically, we use a transformer-based early-exit module shared among sub-models to i) better exploit multi-layer feature representations for task-specific prediction and ii) modulate the feature representation of the backbone model for subsequent predictions. We additionally present a per-client self-distillation approach in which the best sub-model is automatically selected as the teacher of the other sub-models at each client. Our experiments on standard image and speech classification benchmarks across various emerging federated fine-tuning baselines demonstrate ReeFL's effectiveness over previous works.
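The per-client self-distillation idea can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that the "best" sub-model is the exit with the lowest mean cross-entropy on the client's current batch, and that the other exits are pulled toward it with a KL-divergence term. The function names, the `temperature` and `kd_weight` parameters, and the selection criterion are all hypothetical choices made for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Cross-entropy of one sample given its integer class label."""
    return -math.log(softmax(logits)[label])


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def select_teacher(exit_logits, labels):
    """Pick the exit with the lowest mean cross-entropy on this batch.

    exit_logits[e][n] holds the logits of exit e for sample n.
    Assumption: 'best' means lowest batch loss; the paper may use
    a different criterion.
    """
    mean_ce = [
        sum(cross_entropy(l, y) for l, y in zip(per_exit, labels)) / len(labels)
        for per_exit in exit_logits
    ]
    teacher = min(range(len(mean_ce)), key=mean_ce.__getitem__)
    return teacher, mean_ce


def self_distillation_loss(exit_logits, labels, temperature=1.0, kd_weight=1.0):
    """Sum of per-exit cross-entropy plus KL from the teacher to every other exit."""
    teacher, mean_ce = select_teacher(exit_logits, labels)
    kd = 0.0
    for e, per_exit in enumerate(exit_logits):
        if e == teacher:
            continue  # the teacher does not distill into itself
        for t_logits, s_logits in zip(exit_logits[teacher], per_exit):
            kd += kl_divergence(
                softmax(t_logits, temperature), softmax(s_logits, temperature)
            )
    kd /= len(labels)
    return sum(mean_ce) + kd_weight * kd, teacher


# Three exits, two samples, two classes (toy values).
toy_logits = [
    [[0.0, 0.0], [0.0, 0.0]],  # exit 0: uninformative
    [[2.0, 0.0], [0.0, 2.0]],  # exit 1: most confident and correct
    [[1.0, 0.0], [0.0, 1.0]],  # exit 2: correct but less confident
]
toy_labels = [0, 1]
toy_loss, toy_teacher = self_distillation_loss(toy_logits, toy_labels)
```

In a real training loop the teacher's outputs would typically be detached from the gradient computation so that only the student exits are updated by the distillation term; that detail is omitted here because the sketch uses plain Python floats.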