This paper addresses the gradient coding and coded matrix multiplication problems in distributed optimization and coded computing. We present a numerically stable binary coding method that overcomes the drawbacks of the \textit{Fractional Repetition Coding} gradient coding method proposed by Tandon et al., and that can also be used in coded computing networks with heterogeneous servers. Specifically, we propose a construction for fractional repetition gradient coding that keeps the generator matrix close to perfectly balanced for any set of coding parameters and admits a low-complexity decoding step. The proposed binary encoding avoids operations over the real and complex numbers, which are inherently numerically unstable, thereby enabling numerically stable distributed encodings of the partial gradients. We then connect gradient coding to coded matrix multiplication: we show that any gradient coding scheme can be extended to coded matrix multiplication, and that the proposed binary gradient coding scheme yields two different coded matrix multiplication schemes, each achieving a different trade-off.
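To make the binary encoding and decoding idea concrete, the following is a minimal sketch of the baseline fractional repetition layout in the simple case where $(s+1)$ divides the number of workers $n$. It is not the construction proposed in this paper, which targets arbitrary parameters. The function names `frc_generator` and `frc_decode_vector`, the greedy decoder, and the toy sizes are illustrative assumptions. The last few lines illustrate one possible reading of the link to matrix multiplication, using the identity $A^\top M = \sum_i A_i^\top M_i$ over row blocks.

```python
import numpy as np

def frc_generator(n, s):
    """Binary generator matrix (n workers x n partitions) for a fractional
    repetition layout. Assumption of this sketch: (s + 1) divides n."""
    if n % (s + 1) != 0:
        raise ValueError("this sketch assumes (s + 1) divides n")
    block = s + 1
    G = np.zeros((n, n), dtype=np.int64)
    for w in range(n):
        j = w // block  # s + 1 consecutive workers share partition block j
        G[w, j * block:(j + 1) * block] = 1
    return G

def frc_decode_vector(G, alive):
    """Greedy binary decoding vector a with a @ G == all-ones, supported
    only on the responsive workers listed in `alive`."""
    n, k = G.shape
    a = np.zeros(n, dtype=np.int64)
    covered = np.zeros(k, dtype=bool)
    for w in alive:
        support = G[w] == 1
        if not covered[support].any():  # rows are identical or disjoint
            a[w] = 1
            covered |= support
    if not covered.all():
        raise ValueError("too many stragglers to recover the full sum")
    return a

rng = np.random.default_rng(0)
n, s, d = 6, 2, 4
G = frc_generator(n, s)

# Toy integer "partial gradients", one row per data partition.
partial = rng.integers(-5, 6, size=(n, d))
coded = G @ partial                      # row w is what worker w sends

alive = [w for w in range(n) if w not in {0, 4}]  # workers 0 and 4 straggle
a = frc_decode_vector(G, alive)
gradient = a @ coded                     # only 0/1 additions, no inversion

# Same encode/decode applied to partial products A_i^T M_i (illustrative).
A = rng.integers(-3, 4, size=(2 * n, 3))
M = rng.integers(-3, 4, size=(2 * n, 5))
prods = np.stack([Ai.T @ Mi for Ai, Mi in zip(np.split(A, n), np.split(M, n))])
coded_prods = np.tensordot(G, prods, axes=1)
product = np.tensordot(a, coded_prods, axes=1)
```

Because both the generator matrix and the decoding vector contain only zeros and ones, recovery here reduces to selecting and adding worker outputs, which is the property the abstract credits for numerical stability.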