Federated learning addresses concerns such as data privacy by training a global model collaboratively across participating devices. However, in complex network environments, factors such as network topology and device computing power can affect its training or communication process. Computing and network convergence (CNC) of 6G networks, a new network architecture and paradigm in which computing is measurable, perceptible, distributable, dispatchable, and manageable, can effectively support federated learning training and improve its communication efficiency. CNC achieves this by guiding the training of participating devices according to business requirements, resource load, network conditions, and device computing power. In this paper, to improve the communication efficiency of federated learning in complex networks, we study communication efficiency optimization of federated learning for CNC of 6G networks and propose a method that makes decisions about the training process according to the network conditions and computing power of the participating devices. The experiments cover the two architectures in which federated learning devices can be organized, schedule devices to participate in training based on their computing power, and optimize communication efficiency during the transfer of model parameters. The results show that the proposed method can (1) cope well with complex network situations, (2) effectively balance the delay distribution of participating devices during local training, (3) improve communication efficiency during the transfer of model parameters, and (4) improve resource utilization in the network.