In many federated learning schemes, a random subset of clients sends its model updates to the server in each round for aggregation. Although this client selection strategy aims to reduce communication overhead, it remains inefficient in terms of energy and computation, especially when the clients are resource-constrained devices. This is because conventional random client selection overlooks the content of the exchanged information and provides no mechanism for reducing the transmission of semantically redundant data. To overcome this challenge, we propose clustering the clients using similarity metrics and selecting a single client from each cluster in each round to participate in federated training. To evaluate our approach, we perform an extensive feasibility study that considers nine statistical metrics in the clustering process. Simulation results reveal that, in a scenario with high data heterogeneity across clients, similarity-based clustering can reduce the number of required rounds compared to the baseline random client selection. In addition, energy consumption can be reduced by 23.93% to 41.61% for those similarity metrics that yield a number of clients per round equivalent to that of the baseline random scheme.
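The selection step described above, clustering clients by similarity and then choosing one client per cluster in each round, can be illustrated with a minimal sketch. The abstract names neither the nine metrics nor the clustering procedure, so everything specific here is an assumption made for illustration: cosine similarity as the metric, a greedy "leader" clustering with a fixed `threshold`, and the hypothetical helper names `cluster_clients` and `select_clients`.

```python
import math
import random


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def cluster_clients(updates, threshold=0.9):
    """Greedy leader clustering (an assumption, not the paper's procedure).

    updates: dict mapping client id -> flattened model update (list of floats).
    A client joins the first cluster whose leader (first member) is at least
    `threshold` similar to it; otherwise it starts a new cluster.
    """
    clusters = []  # each cluster is a list of client ids; index 0 is the leader
    for client_id, update in updates.items():
        for cluster in clusters:
            if cosine_similarity(update, updates[cluster[0]]) >= threshold:
                cluster.append(client_id)
                break
        else:
            # No existing leader was similar enough, so open a new cluster.
            clusters.append([client_id])
    return clusters


def select_clients(clusters, rng):
    """Pick exactly one client from every cluster for the next round."""
    return [rng.choice(cluster) for cluster in clusters]


# Toy updates: clients a/b and c/d point in nearly the same direction.
updates = {
    "a": [1.0, 0.0, 0.1],
    "b": [0.9, 0.1, 0.1],
    "c": [0.0, 1.0, 0.0],
    "d": [0.1, 0.9, 0.0],
}
clusters = cluster_clients(updates, threshold=0.9)
selected = select_clients(clusters, random.Random(0))
```

With these toy vectors, the four clients fall into two clusters, so only two updates are transmitted per round instead of four. In this sketch the number of clients per round equals the number of clusters, which in turn depends on the chosen metric and threshold.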