Coded distributed computing, proposed by Li et al., offers significant potential for reducing the communication load in MapReduce computing systems. In the setting of \emph{cascaded} coded distributed computing with $K$ nodes, $N$ input files, and $Q$ output functions, the objective is to compute each output function at $s\geq 1$ nodes with a computation load $r\geq 1$, so that coding techniques can be applied during the Shuffle phase to achieve the minimum communication load. However, a major limitation of most existing coded distributed computing schemes is that they split the original data into a number of input files that grows exponentially, with $N/\binom{K}{r} \in\mathbb{N}$, and require an exponentially large number of output functions, with $Q/\binom{K}{s} \in\mathbb{N}$. This imposes stringent requirements on implementation and results in significant coding complexity when $K$ is large. In this paper, we focus on the cascaded case with $K/s\in\mathbb{N}$ and carefully design the input file storage and output function assignment strategies based on a grouping method, such that a low-complexity two-round Shuffle phase becomes possible. The main advantages of our proposed scheme are: 1) the communication load is quite close to, or surprisingly even better than, that of the optimal state-of-the-art scheme proposed by Li et al.; 2) our scheme requires significantly fewer input files and output functions; 3) all operations are implemented over the binary field $\mathbb{F}_2$.
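To give a feel for the divisibility conditions $N/\binom{K}{r}\in\mathbb{N}$ and $Q/\binom{K}{s}\in\mathbb{N}$, the following minimal Python sketch computes the smallest admissible $N$ and $Q$ for a few values of $K$. The helper name `min_counts` and the choice $r=s=K/2$ are illustrative assumptions and are not taken from the paper.

```python
from math import comb


def min_counts(K: int, r: int, s: int) -> tuple[int, int]:
    """Smallest N and Q with N / C(K, r) and Q / C(K, s) both integers.

    Hypothetical helper for illustration: the smallest positive multiples
    are simply the binomial coefficients themselves.
    """
    return comb(K, r), comb(K, s)


# Illustrative parameters (assumed, not from the paper): r = s = K / 2.
for K in (10, 20, 40):
    N, Q = min_counts(K, K // 2, K // 2)
    print(f"K={K}: N >= {N}, Q >= {Q}")
```

Under these assumed parameters, the required counts grow rapidly as $K$ increases, which is the implementation burden the abstract refers to.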