Implementing decentralized gradient descent (DGD) in wireless systems is challenging due to noise, fading, and limited bandwidth, necessitating topology awareness, transmission scheduling, and the acquisition of channel state information (CSI) to mitigate interference and maintain reliable communications. These operations may result in substantial signaling overhead and scalability challenges in large networks lacking central coordination. This paper introduces a scalable DGD algorithm that eliminates the need for scheduling, topology information, or CSI (both average and instantaneous). At its core is a Non-Coherent Over-The-Air (NCOTA) consensus scheme that exploits a noisy energy superposition property of wireless channels. Nodes encode their local optimization signals into energy levels within an OFDM frame and transmit simultaneously, without coordination. The key insight is that the received energy equals, on average, the sum of the energies of the transmitted signals, scaled by their respective average channel gains, akin to a consensus step. This property enables unbiased consensus estimation that uses the average channel gains as mixing weights, thereby removing the need for their explicit design or for CSI. Introducing a consensus stepsize mitigates consensus estimation errors caused by fluctuations of the received energy around its expected value. For strongly convex problems, it is shown that the expected squared distance between the local models and the globally optimal model vanishes at a rate of $\mathcal O(1/\sqrt{k})$ after $k$ iterations, with suitably decreasing learning and consensus stepsizes. Extensions accommodate a broad class of fading models and frequency-selective channels. Numerical experiments on image classification demonstrate faster convergence in terms of running time compared to state-of-the-art schemes, especially in dense network scenarios.
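The energy superposition property described above can be checked numerically with a minimal sketch. It is not the paper's encoding scheme or OFDM framing: it assumes a single resource element, independent zero-mean Rayleigh fading with average gains `avg_gain`, and additive complex Gaussian noise of variance `noise_var`. The function name `received_energy_mean` and all parameter values are hypothetical, chosen only for illustration. Under these assumptions, the mean received energy should equal the gain-weighted sum of the transmitted energies plus the noise variance.

```python
import numpy as np


def received_energy_mean(tx_energy, avg_gain, noise_var, trials, rng):
    """Monte Carlo estimate of E[|y|^2] for y = sum_i h_i * x_i + n.

    Assumption (not from the paper): h_i ~ CN(0, avg_gain[i]) independently
    across nodes (Rayleigh fading), and n ~ CN(0, noise_var).
    """
    n_nodes = len(tx_energy)
    # Each node transmits with amplitude sqrt(energy); the phase is irrelevant
    # because the fading coefficients are circularly symmetric.
    x = np.sqrt(tx_energy)
    # Independent complex Gaussian channels, one row per trial.
    h = np.sqrt(avg_gain / 2) * (
        rng.standard_normal((trials, n_nodes))
        + 1j * rng.standard_normal((trials, n_nodes))
    )
    noise = np.sqrt(noise_var / 2) * (
        rng.standard_normal(trials) + 1j * rng.standard_normal(trials)
    )
    # All nodes transmit simultaneously; the channel superimposes the signals.
    y = h @ x + noise
    return float(np.mean(np.abs(y) ** 2))


rng = np.random.default_rng(0)
tx_energy = np.array([1.0, 0.5, 2.0, 0.0])  # hypothetical energy levels
avg_gain = np.array([0.8, 0.3, 0.1, 0.5])   # hypothetical average channel gains
noise_var = 0.05

estimate = received_energy_mean(tx_energy, avg_gain, noise_var, 200_000, rng)
# Gain-weighted sum of transmitted energies plus the noise energy offset.
expected = float(avg_gain @ tx_energy) + noise_var

print(f"empirical mean energy: {estimate:.4f}")
print(f"weighted energy sum:   {expected:.4f}")
```

In this toy model the receiver never estimates any `h_i`; the average gains act as implicit mixing weights, and the noise contributes only a constant energy offset. Individual realizations of `|y|^2` fluctuate strongly around the mean, which is the kind of estimation error the consensus stepsize is meant to dampen.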