We consider a wireless distributed computing system based on the MapReduce framework, which operates in three phases: \textit{Map}, \textit{Shuffle}, and \textit{Reduce}. The system comprises a set of distributed nodes that are assigned to compute arbitrary output functions of a file library. The computation of each output function is decomposed into Map and Reduce functions, and the Shuffle phase, in which the nodes exchange data, links the two. In our model, the Shuffle-phase communication takes place over a full-duplex wireless interference channel. For this setting, a coded wireless MapReduce distributed computing scheme is known in the literature and achieves optimal performance under one-shot linear schemes. However, that scheme requires a very large number of input files, growing exponentially with the number of nodes. We present schemes that require the number of files to be only on the order of the number of nodes while achieving the same performance as the existing scheme. The schemes are obtained by designing a structure called a wireless MapReduce array, which succinctly represents all three phases in a single array. Wireless MapReduce arrays can also be obtained from the extended placement delivery arrays known for multi-antenna coded caching schemes.