Recent developments and emerging use cases, such as smart Internet of Things (IoT) and Edge AI, have sparked considerable interest in training neural networks over fully decentralized (serverless) networks. A major challenge in decentralized learning is ensuring stable convergence without imposing strong assumptions on each agent's data distribution or update policy. To address these issues, we propose DRACO, a novel method for decentralized asynchronous Stochastic Gradient Descent (SGD) over row-stochastic gossip wireless networks that leverages continuous communication. Our approach enables edge devices in decentralized networks to perform local training and model exchange along a continuous timeline, eliminating the need for synchronized timing. The algorithm also decouples the communication and computation schedules, giving every user complete autonomy while keeping instructions for stragglers manageable. Through a comprehensive convergence analysis, we highlight the advantages of asynchronous and autonomous participation in decentralized optimization. Our numerical experiments corroborate the efficacy of the proposed technique.
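To make the general pattern concrete, the following is a minimal, self-contained Python sketch of asynchronous gossip SGD with locally normalized (row-stochastic) mixing and decoupled send/receive steps. It is an illustrative toy, not the authors' DRACO implementation: the Poisson wake-up clocks, uniform mixing weights, least-squares objective, and all variable names are assumptions introduced here for exposition.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setup: n agents, each holding private least-squares data.
n, d, m = 8, 5, 20
A = [rng.normal(size=(m, d)) for _ in range(n)]
b = [Ai @ rng.normal(size=d) + 0.1 * rng.normal(size=m) for Ai in A]
x = [rng.normal(size=d) for _ in range(n)]   # local models
inbox = [[] for _ in range(n)]               # models received from neighbors

def local_grad(i, xi):
    """Stochastic gradient of agent i's local least-squares loss (one sample)."""
    k = rng.integers(m)
    return A[i][k] * (A[i][k] @ xi - b[i][k])

T, lr = 5000, 0.01
# Each agent wakes on its own Poisson clock: there is no global round structure,
# which models continuous-time, unsynchronized participation.
clocks = rng.exponential(1.0, size=n)
for _ in range(T):
    i = int(np.argmin(clocks))               # next agent to wake up
    clocks[i] += rng.exponential(1.0)

    # Communication step: mix the local model with whatever has arrived so far.
    # Weights are normalized locally at the receiver, so each agent's mixing
    # row sums to one (row-stochastic) without requiring column-stochasticity.
    models = [x[i]] + inbox[i]
    w = np.full(len(models), 1.0 / len(models))
    x[i] = sum(wk * mk for wk, mk in zip(w, models))
    inbox[i] = []

    # Computation step, scheduled independently of the mixing above.
    x[i] -= lr * local_grad(i, x[i])

    # Push the update to one random neighbor; the sender never blocks or waits,
    # so slow agents (stragglers) cannot stall anyone else.
    j = int(rng.integers(n - 1)); j += (j >= i)
    inbox[j].append(x[i].copy())

print("model disagreement:", np.std(np.stack(x), axis=0).mean())
```

Normalizing the weights at the receiver over whatever models happen to have arrived is what keeps the effective mixing matrix row-stochastic under arbitrary delays and drop-outs: no agent needs to know who will receive its messages or wait for acknowledgments, which is the autonomy property the abstract emphasizes.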