Existing works on federated learning (FL) often assume an ideal system with either full client participation or uniformly distributed client participation. In practice, however, it has been observed that some clients may never participate in FL training (i.e., incomplete client participation) due to a myriad of system heterogeneity factors. A popular approach to mitigating the impact of incomplete client participation is the server-assisted federated learning (SA-FL) framework, where the server is equipped with an auxiliary dataset. However, although SA-FL has been empirically shown to be effective in addressing the incomplete client participation problem, a theoretical understanding of SA-FL is still lacking. Meanwhile, the ramifications of incomplete client participation in conventional FL are also poorly understood. These theoretical gaps motivate us to rigorously investigate SA-FL. Toward this end, we first show that conventional FL is {\em not} PAC-learnable under incomplete client participation in the worst case. We then show that the PAC-learnability of FL with incomplete client participation can indeed be revived by SA-FL, which theoretically justifies the use of SA-FL for the first time. Lastly, to provide practical guidance for SA-FL training under {\em incomplete client participation}, we propose the $\mathsf{SAFARI}$ (server-assisted federated averaging) algorithm, which enjoys the same linear convergence speedup guarantees as classic FL under ideal client participation assumptions, making it the first SA-FL algorithm with convergence guarantees. Extensive experiments on different datasets show that $\mathsf{SAFARI}$ significantly improves performance under incomplete client participation.
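The abstract only names $\mathsf{SAFARI}$ (server-assisted federated averaging) without specifying its update rule. The following is a minimal illustrative sketch of the general SA-FL idea under incomplete participation: participating clients run local SGD as in FedAvg, and the server additionally trains on its auxiliary dataset before averaging. The quadratic objective, the specific aggregation, and all hyperparameters here are assumptions for illustration, not the paper's actual algorithm.

```python
import numpy as np

def local_sgd(w, X, y, lr=0.1, steps=5):
    """A few local gradient steps on a least-squares loss (stand-in for
    each client's local training; the loss and step count are assumed)."""
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

def sa_fl_round(w, client_data, server_data, participating):
    """One communication round: only the participating clients train
    locally; the server also trains on its auxiliary dataset to
    compensate for the clients that never show up, then averages."""
    updates = [local_sgd(w, *client_data[i]) for i in participating]
    updates.append(local_sgd(w, *server_data))  # server-assisted update
    return np.mean(updates, axis=0)

rng = np.random.default_rng(0)
w_true = np.array([2.0, -1.0])

def make_data(n):
    X = rng.normal(size=(n, 2))
    return X, X @ w_true  # noiseless linear data for a clean illustration

clients = [make_data(50) for _ in range(10)]
server = make_data(100)  # auxiliary dataset held at the server

w = np.zeros(2)
for _ in range(200):
    # Incomplete client participation: clients 5-9 never participate.
    w = sa_fl_round(w, clients, server, participating=range(5))
```

In this toy setup the server's auxiliary data plays the role the theory ascribes to SA-FL: it keeps the aggregate update anchored to the full data distribution even though half the clients never contribute.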