Federated learning (FL) enables a model to be trained across multiple clients in a privacy-preserving manner. One of the main challenges of FL is accommodating clients with varying hardware capacities, since clients differ in their compute and memory constraints. To tackle this challenge, recent state-of-the-art approaches use early exits. Nonetheless, these approaches fall short of mitigating the difficulties of jointly learning multiple exit classifiers, often relying on hand-picked heuristics for knowledge distillation among classifiers and/or adding extra layers for weaker classifiers. In this work, instead of using multiple classifiers, we propose ReeFL, a recurrent early-exit approach that fuses features from different sub-models into a single shared classifier. Specifically, we use a transformer-based early-exit module shared among sub-models to i) better exploit multi-layer feature representations for task-specific prediction and ii) modulate the feature representation of the backbone model for subsequent predictions. We additionally present a per-client self-distillation approach in which the best sub-model at each client is automatically selected as the teacher of that client's other sub-models. Our experiments on standard image and speech classification benchmarks, across various emerging federated fine-tuning baselines, demonstrate ReeFL's effectiveness over previous works.
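To make the per-client self-distillation idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes that the "best" sub-model is the exit with the lowest mean cross-entropy on the client's current batch, and that the other exits are distilled toward it with a temperature-scaled KL divergence; the abstract specifies neither choice. The function names (`softmax`, `cross_entropy`, `kl_divergence`, `select_teacher_and_distill`) and the `temperature` parameter are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Cross-entropy of a single sample against an integer class label."""
    return -math.log(softmax(logits)[label])


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions with q > 0 everywhere."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def select_teacher_and_distill(exit_logits, labels, temperature=1.0):
    """exit_logits[k][n] holds the logits of exit k for sample n.

    Assumption: the teacher is the exit with the lowest mean cross-entropy
    on this batch. Returns (teacher index, per-exit CE, mean KD loss).
    """
    num_samples = len(labels)
    ce_per_exit = [
        sum(cross_entropy(z, y) for z, y in zip(per_exit, labels)) / num_samples
        for per_exit in exit_logits
    ]
    teacher = min(range(len(ce_per_exit)), key=ce_per_exit.__getitem__)

    # Distill every non-teacher exit toward the teacher's softened outputs.
    kd_total, pairs = 0.0, 0
    for k, per_exit in enumerate(exit_logits):
        if k == teacher:
            continue
        for t_logits, s_logits in zip(exit_logits[teacher], per_exit):
            kd_total += kl_divergence(
                softmax(t_logits, temperature), softmax(s_logits, temperature)
            )
            pairs += 1
    kd_loss = kd_total / pairs if pairs else 0.0
    return teacher, ce_per_exit, kd_loss


# Toy batch: three exits, two samples, two classes; deeper exits are more confident.
toy_logits = [
    [[0.1, 0.0], [0.0, 0.1]],
    [[1.0, 0.0], [0.0, 1.0]],
    [[3.0, 0.0], [0.0, 3.0]],
]
toy_labels = [0, 1]
teacher_idx, ce_values, kd_value = select_teacher_and_distill(toy_logits, toy_labels)
```

In this toy batch the deepest exit has the lowest cross-entropy and is therefore chosen as teacher. In an actual training loop the teacher's outputs would typically be treated as constants (no gradient), and the distillation term would be added to the supervised loss with a weighting that this sketch leaves unspecified.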