This paper considers the problem of computing the product of two massive matrices $\mathbf{A}$ and $\mathbf{B}$ in a distributed manner. We provide a modulo technique that can be applied to coded distributed matrix multiplication problems to reduce the recovery threshold. The technique exploits the special structure of the interpolation points and can be applied to many existing coded matrix designs. The recently studied discrete Fourier transform based code achieves a smaller recovery threshold than the optimal MatDot code, at the expense of being unable to tolerate stragglers. We therefore also propose a distributed matrix multiplication scheme, based on the idea of locally repairable codes, that reduces the recovery threshold of the MatDot code while providing resilience to stragglers. Finally, we apply our constructions to a class of matrix computation problems that includes generalized linear models as a special case.