Decentralized learning (DL) systems have been gaining popularity because they avoid raw data sharing by communicating only model parameters, hence preserving data confidentiality. However, the large size of deep neural networks poses a significant challenge for decentralized training, since each node needs to exchange gigabytes of data, overloading the network. In this paper, we address this challenge with JWINS, a communication-efficient and fully decentralized learning system that shares only a subset of parameters through sparsification. JWINS uses a wavelet transform to limit the information loss due to sparsification, and a randomized communication cut-off that reduces communication usage without damaging the performance of trained models. We demonstrate empirically with 96 DL nodes on non-IID datasets that JWINS can achieve accuracies similar to those of full-sharing DL while sending up to 64% fewer bytes. Additionally, under low communication budgets, JWINS outperforms the state-of-the-art communication-efficient DL algorithm CHOCO-SGD by up to 4x in terms of network savings and time.
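
To make the two mechanisms named in the abstract more concrete, the following is a minimal sketch of sparsifying a parameter vector in the wavelet domain under a randomly chosen cut-off. It is not the JWINS implementation: the choice of the Haar wavelet, the top-k selection rule, the candidate fractions, and every function name here are assumptions made purely for illustration.

```python
import math
import random

SQRT2 = math.sqrt(2.0)

def haar_forward(x):
    """Multi-level orthonormal Haar transform (assumed wavelet).
    len(x) must be a power of two."""
    out = list(x)
    n = len(out)
    while n > 1:
        half = n // 2
        avg = [(out[2 * i] + out[2 * i + 1]) / SQRT2 for i in range(half)]
        diff = [(out[2 * i] - out[2 * i + 1]) / SQRT2 for i in range(half)]
        out[:n] = avg + diff  # averages first, details after
        n = half
    return out

def haar_inverse(c):
    """Inverse of haar_forward."""
    out = list(c)
    n = 1
    while n < len(out):
        merged = []
        for a, d in zip(out[:n], out[n:2 * n]):
            merged.append((a + d) / SQRT2)
            merged.append((a - d) / SQRT2)
        out[:2 * n] = merged
        n *= 2
    return out

def top_k_sparsify(coeffs, k):
    """Keep the k largest-magnitude coefficients as {index: value}."""
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    return {i: coeffs[i] for i in order[:k]}

def densify(sparse, n):
    """Rebuild a dense coefficient list, filling dropped entries with 0."""
    dense = [0.0] * n
    for i, v in sparse.items():
        dense[i] = v
    return dense

def random_cutoff(n, fractions=(0.1, 0.25, 0.5, 1.0), rng=random):
    """Hypothetical randomized cut-off: pick this round's sharing
    fraction at random and return how many coefficients to send."""
    return max(1, int(rng.choice(fractions) * n))

# Toy "model": 16 parameters. A node would send only `message`.
params = [math.sin(i / 4.0) for i in range(16)]
coeffs = haar_forward(params)
k = random_cutoff(len(coeffs), rng=random.Random(0))
message = top_k_sparsify(coeffs, k)
approx = haar_inverse(densify(message, len(coeffs)))
print(f"sent {len(message)} of {len(params)} coefficients")
```

Because this Haar transform is orthonormal, the squared reconstruction error equals the sum of squares of the dropped coefficients, which is why keeping the largest-magnitude coefficients is a natural selection rule in this sketch.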