We consider the problem of evaluating arbitrary multivariate polynomials over a massive dataset containing multiple inputs, on a distributed computing system with a master node and multiple worker nodes. Generalized Lagrange Coded Computing (GLCC) codes are proposed to simultaneously provide resiliency against stragglers that do not return computation results in time, security against adversarial workers that deliberately modify results for their benefit, and information-theoretic privacy of the dataset amidst possible collusion of workers. GLCC codes are constructed by first partitioning the dataset into multiple groups, then encoding the dataset using carefully designed interpolating polynomials, and sharing multiple encoded data points with each worker, such that the interference among computation results across groups can be eliminated at the master. In particular, GLCC codes include the state-of-the-art Lagrange Coded Computing (LCC) codes as a special case, and exhibit a more flexible tradeoff between communication and computation overheads in optimizing system efficiency. Furthermore, we apply GLCC codes to the distributed training of machine learning models, and demonstrate that they achieve speedups of $2.5\text{--}3.9\times$ over LCC codes in training time, in experiments training image classifiers across different datasets, model architectures, and straggler patterns.
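To make the grouping and encoding flow concrete, the following minimal Python sketch illustrates the Lagrange-interpolation encoding and decoding pipeline described above on a toy scalar dataset. All names (`lagrange_basis`, `encode_group`), the evaluation points, and the toy polynomial are illustrative assumptions rather than the paper's notation; the sketch works over the reals for readability (the actual construction operates over a finite field) and omits the random padding for privacy, the redundancy for adversary tolerance, and the cross-group interference-cancellation mechanism of the full GLCC construction.

```python
def lagrange_basis(z, points, j):
    """Evaluate the j-th Lagrange basis polynomial through `points` at z."""
    num, den = 1.0, 1.0
    for m, p in enumerate(points):
        if m != j:
            num *= (z - p)
            den *= (points[j] - p)
    return num / den

def encode_group(group_data, betas, alpha):
    """Encode one group as u_g(alpha), where u_g interpolates the group's
    inputs at the points betas."""
    return sum(x * lagrange_basis(alpha, betas, j) for j, x in enumerate(group_data))

# Toy setting: K = 4 scalar inputs, G = 2 groups, N = 7 workers,
# and a degree-2 target polynomial f, so f(u_g(z)) has degree 2.
f = lambda x: 3 * x**2 + x                     # polynomial to evaluate on every input
dataset = [1.0, 2.0, 3.0, 4.0]                 # X_1, ..., X_4
groups = [dataset[0:2], dataset[2:4]]          # partition into G = 2 groups
betas = [0.0, 1.0]                             # interpolation points shared by all groups
alphas = [2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]   # distinct evaluation points, one per worker

# Each worker receives one encoded point per group and returns f of each.
worker_results = [[f(encode_group(g, betas, a)) for g in groups] for a in alphas]

# Master: f(u_g(z)) has degree deg(f) * (len(betas) - 1) = 2, so any 3
# non-straggler responses suffice to interpolate it and recover
# f(X_k) = f(u_g(beta_k)) for every input in every group.
recovered = []
for g_idx in range(len(groups)):
    ys = [worker_results[i][g_idx] for i in range(3)]   # first 3 responses
    pts = alphas[:3]
    for beta in betas:
        recovered.append(sum(ys[j] * lagrange_basis(beta, pts, j) for j in range(3)))

print(recovered)                 # ~ [f(1), f(2), f(3), f(4)]
print([f(x) for x in dataset])   # ground truth for comparison
```

In this toy run the master needs only 3 of the 7 workers per group, illustrating the straggler resiliency; the full GLCC construction additionally trades the number of encoded points per worker against the decoding threshold, which is the communication-versus-computation tradeoff referred to in the abstract.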