Coded distributed computing (CDC) was introduced to greatly reduce the communication load of MapReduce computing systems. Such a system has $K$ nodes, $N$ input files, and $Q$ Reduce functions. Each input file is mapped by $r$ nodes and each Reduce function is computed by $s$ nodes. The architecture must allow for coding techniques that achieve the maximum multicast gain. CDC schemes that achieve the optimal communication load have been proposed before. The parameters $N$ and $Q$ in those schemes, however, grow too fast with respect to $K$ for the schemes to be of great practical value. To improve the situation, researchers have constructed asymptotically optimal cascaded CDC schemes with $r+s=K$ from symmetric designs. In this paper, we propose new asymptotically optimal cascaded CDC schemes. Like the known schemes, ours have $r+s=K$ and use symmetric designs as construction tools. Unlike the previous schemes, ours have much smaller communication loads for the same parameters $K$, $r$, $N$, and $Q$. We also expand the construction tools to include almost difference sets, and use them to construct a new asymptotically optimal cascaded CDC scheme.
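The abstract names symmetric designs as construction tools without spelling out what they are. As a minimal sketch only, the Python snippet below builds the translates of a difference set modulo $v$ and checks the defining properties of a symmetric $(v, k, \lambda)$ design: $v$ points, $v$ blocks of size $k$, and every pair of distinct points contained in exactly $\lambda$ blocks. The function names `develop`, `pair_counts`, and `is_symmetric_design` are hypothetical, chosen for illustration. The snippet does not show how the paper assigns files or Reduce functions to nodes; that mapping is not given in the abstract.

```python
from itertools import combinations


def develop(base, v):
    """Return the v translates of `base` modulo v (illustrative helper)."""
    return [frozenset((x + g) % v for x in base) for g in range(v)]


def pair_counts(blocks, v):
    """For every pair of distinct points, count the blocks containing both."""
    counts = {pair: 0 for pair in combinations(range(v), 2)}
    for block in blocks:
        for pair in combinations(sorted(block), 2):
            counts[pair] += 1
    return counts


def is_symmetric_design(blocks, v, k, lam):
    """Check the defining properties of a symmetric (v, k, lam) design."""
    if len(set(blocks)) != v:             # as many distinct blocks as points
        return False
    if any(len(b) != k for b in blocks):  # every block contains k points
        return False
    # every pair of distinct points lies in exactly lam blocks
    return all(c == lam for c in pair_counts(blocks, v).values())


# {1, 2, 4} is a (7, 3, 1) difference set in Z_7; its translates
# form the Fano plane, a symmetric (7, 3, 1) design.
fano = develop({1, 2, 4}, 7)
print(is_symmetric_design(fano, 7, 3, 1))
```

The same check applies to any candidate base set: the quadratic residues modulo 11, $\{1, 3, 4, 5, 9\}$, pass it with parameters $(11, 5, 2)$, while $\{0, 1, 2\}$ modulo 7 fails because some pairs are covered twice and others not at all.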