With rising privacy concerns in machine learning, federated learning (FL) was introduced in 2017: clients, such as mobile devices, train a model locally and send their updates to a centralized server. Selecting clients at random can harm learning performance for several reasons. Many studies have proposed approaches to address the challenges of client selection in FL, but no systematic literature review (SLR) on the topic existed. This SLR investigates the state of the art of client selection in FL and identifies the challenges, the proposed solutions, and the metrics used to evaluate those solutions. We systematically reviewed 47 primary studies. The main challenges found in client selection are heterogeneity, resource allocation, communication costs, and fairness. Client selection schemes aim to improve on the original random selection algorithm by addressing one or more of these challenges. The most common metric is testing accuracy versus communication rounds: testing accuracy measures how successful the learning is, and it should preferably be reached in as few communication rounds as possible, because communication rounds are very expensive. Although the current state of client selection leaves room for several improvements, the most beneficial are evaluating the impact of unsuccessful clients and gaining a more theoretical understanding of the impact of fairness in FL.