Recent developments and emerging use cases, such as the smart Internet of Things (IoT) and Edge AI, have sparked considerable interest in training neural networks over fully decentralized (serverless) networks. One of the major challenges of decentralized learning is to ensure stable convergence without resorting to strong per-agent assumptions on data distributions or update policies. To address these issues, we propose DRACO, a novel method for decentralized asynchronous Stochastic Gradient Descent (SGD) over row-stochastic gossip wireless networks that leverages continuous communication. Our approach enables edge devices within decentralized networks to perform local training and model exchange along a continuous timeline, thereby eliminating the need for synchronized timing. The algorithm also decouples communication and computation schedules, which grants every user complete autonomy and keeps instructions for stragglers manageable. Through a comprehensive convergence analysis, we highlight the advantages of asynchronous and autonomous participation in decentralized optimization. Our numerical experiments corroborate the efficacy of the proposed technique.
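To make the decoupling of communication and computation concrete, the following is a minimal illustrative sketch, not the paper's DRACO implementation: each agent runs local SGD steps and sends its model on independent random schedules, and arriving models are mixed with row-stochastic weights. The toy quadratic objective, the firing probabilities, and all variable names are assumptions introduced purely for illustration.

```python
# Illustrative sketch of asynchronous gossip SGD with decoupled schedules.
# All objectives, probabilities, and names below are assumptions, not DRACO itself.
import numpy as np

rng = np.random.default_rng(0)
n_agents, dim, horizon, lr = 4, 5, 200, 0.05
targets = rng.normal(size=(n_agents, dim))   # each agent's local optimum (toy data)
x = rng.normal(size=(n_agents, dim))         # local model copies
inbox = [[] for _ in range(n_agents)]        # models received asynchronously

for t in range(horizon):
    for i in range(n_agents):
        # Computation schedule: agent i trains only when its own clock fires.
        if rng.random() < 0.7:
            grad = x[i] - targets[i]          # gradient of 0.5 * ||x - target||^2
            x[i] -= lr * grad
        # Communication schedule: independent of computation; push the current
        # model to a random neighbor (uncoordinated, continuous transmissions).
        if rng.random() < 0.5:
            j = rng.integers(n_agents)
            if j != i:
                inbox[j].append(x[i].copy())
    for i in range(n_agents):
        # Row-stochastic mixing: weights over {self} + received models sum to 1.
        if inbox[i]:
            msgs = np.stack([x[i]] + inbox[i])
            w = np.full(len(msgs), 1.0 / len(msgs))
            x[i] = w @ msgs
            inbox[i].clear()

print("disagreement:", np.linalg.norm(x - x.mean(axis=0)))
```

In this sketch, no agent ever waits for another: skipped computation or communication rounds simply leave that agent's state untouched, which mirrors how stragglers are tolerated when the two schedules are decoupled.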