Decentralized federated learning (DFL) enables edge devices to collaboratively train models through local training and fully decentralized device-to-device (D2D) model exchanges. However, these energy-intensive operations can rapidly deplete limited device batteries, shortening the devices' operational lifetime and degrading learning performance. To address this limitation, we apply energy harvesting to DFL systems, allowing edge devices to extract ambient energy and operate sustainably. We first derive a convergence bound for wireless DFL with energy harvesting, showing that convergence is influenced by partial device participation and transmission packet drops, both of which in turn depend on the available energy supply. To accelerate convergence, we formulate a joint device scheduling and power control problem and model it as a multi-agent Markov decision process (MDP). Traditional MDP algorithms (e.g., value or policy iteration) require a centralized coordinator with access to all device states and exhibit complexity exponential in the number of devices, making them impractical for large-scale decentralized networks. To overcome these challenges, we propose a fully decentralized policy iteration algorithm that uses only local state information from two-hop neighboring devices, thereby substantially reducing both communication overhead and computational complexity. We further provide a theoretical analysis showing that the proposed decentralized algorithm is asymptotically optimal. Finally, comprehensive numerical experiments on real-world datasets validate the theoretical results and corroborate the effectiveness of the proposed algorithm.
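To make the coupling between energy supply and device participation concrete, the following is a minimal toy sketch of a single DFL round, not the algorithm proposed in this work. Everything in it is an illustrative assumption: the function name `simulate_round`, the scalar "models", the fixed energy costs, and the rule that a device participates only when its battery covers local training plus one D2D broadcast. The local training update and packet drops are omitted; only harvesting, participation, and neighbor averaging are shown.

```python
def simulate_round(models, batteries, neighbors, harvest,
                   train_cost, tx_cost, capacity):
    """One toy DFL round with energy-limited participation (illustrative only).

    models:    list of scalar model values, one per device
    batteries: list of stored energy per device
    neighbors: neighbors[i] is the list of devices adjacent to device i
    harvest:   energy harvested by each device in this round
    """
    n = len(models)

    # Each device first harvests energy, capped by the battery capacity.
    batteries = [min(capacity, b + h) for b, h in zip(batteries, harvest)]

    # Assumed rule: a device participates only if it can pay for
    # local training plus one broadcast to its neighbors.
    need = train_cost + tx_cost
    active = [b >= need for b in batteries]

    new_models = list(models)
    for i in range(n):
        if not active[i]:
            continue  # inactive devices keep their previous model
        # Average the device's own model with models from active neighbors.
        group = [models[i]] + [models[j] for j in neighbors[i] if active[j]]
        new_models[i] = sum(group) / len(group)

    # Only participating devices spend energy.
    batteries = [b - need if a else b for b, a in zip(batteries, active)]
    return new_models, batteries, active


if __name__ == "__main__":
    # Three devices on a line graph: 0 - 1 - 2.
    result = simulate_round(
        models=[0.0, 4.0, 8.0],
        batteries=[1.0, 1.0, 0.0],
        neighbors=[[1], [0, 2], [1]],
        harvest=[1.0, 1.0, 0.5],
        train_cost=1.0,
        tx_cost=0.5,
        capacity=3.0,
    )
    print(result)
```

In this example, device 2 harvests too little energy to participate, so devices 0 and 1 average only with each other while device 2 keeps its stale model. This is the kind of partial participation that the convergence bound accounts for.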