Vers: Fully Distributed Coded Computing System with Distributed Encoding

Coded computing has proved useful in distributed computing. Almost all coded computing systems studied so far assume a setup of one master and several workers. However, emerging technologies such as blockchain, the Internet of Things, and federated learning introduce new requirements for coded computing systems. In these systems, data is generated in a distributed manner, so central encoding/decoding by a master is neither feasible nor scalable. This paper presents a fully distributed coded computing system that consists of $K\in\mathbb{N}$ data owners and $N\in\mathbb{N}$ workers, where data owners employ workers to perform computations on their data, as specified by a target function $f$ of degree $d\in\mathbb{N}$. As there is no central encoder, workers perform the encoding themselves, prior to the computation phase. The challenge in this system is the presence of adversarial data owners, who do not know the data of honest data owners but cause discrepancies by sending different data to different workers, which is detrimental to the local encodings at the workers. There are at most $\beta\in\mathbb{N}$ adversarial data owners, and each sends at most $v\in\mathbb{N}$ different versions of its data. Since the adversaries and their possibly colluding behavior are not known to the workers and honest data owners, workers compute tags of their received data, in addition to their main computational task, and send these tags to the data owners to help them in decoding. We introduce a tag function that allows data owners to partition workers into sets such that all workers in a set received the same data from all data owners. Then, we characterize the fundamental limit of the system, $t^*$, which is the minimum number of workers whose work can be used to correctly calculate the desired function of the data of honest data owners. We show that $t^*=v^{\beta}d(K-1)+1$, and present converse and achievability proofs.
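As a purely illustrative sketch, the threshold formula and the tag-based grouping of workers can be written out as follows. The helper names `recovery_threshold` and `partition_by_tag` and the example tag values are hypothetical and are not taken from the paper. The tags are treated here as opaque values that are equal exactly when two workers received the same data from all data owners; the paper's actual tag function is not reproduced.

```python
from collections import defaultdict


def recovery_threshold(v: int, beta: int, d: int, K: int) -> int:
    """Evaluate t* = v^beta * d * (K - 1) + 1 from the abstract."""
    return v ** beta * d * (K - 1) + 1


def partition_by_tag(tags: dict) -> list:
    """Group worker ids whose reported tags are equal.

    `tags` maps a worker id to its tag. Tags are opaque here
    (an assumption of this sketch, not the paper's tag function).
    """
    groups = defaultdict(list)
    for worker, tag in tags.items():
        groups[tag].append(worker)
    return [sorted(group) for group in groups.values()]


# Example parameters: K = 3 data owners, target function of degree d = 2,
# at most beta = 1 adversarial owner sending at most v = 2 versions.
t_star = recovery_threshold(v=2, beta=1, d=2, K=3)  # 2**1 * 2 * (3 - 1) + 1

# Hypothetical tags reported by five workers: two distinct versions seen.
example_tags = {0: "a", 1: "b", 2: "a", 3: "b", 4: "a"}
groups = partition_by_tag(example_tags)
```

With these example values, `t_star` evaluates to 9, and the five workers fall into two groups according to which version of the data they received.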
