Approximated Coded Computing: Towards Fast, Private and Secure Distributed Machine Learning

In large-scale distributed machine learning systems, coded computing has attracted widespread attention because it can effectively alleviate the impact of stragglers. However, several emerging problems greatly limit the performance of coded distributed systems. First, the presence of colluding workers, who share their results with one another, leads to serious privacy leakage. Second, few existing works consider the security of data transmission in distributed computing systems. Third, the number of results that must be waited for increases with the degree of the decoding function. In this paper, we design a secure and private approximated coded distributed computing (SPACDC) scheme that addresses these problems simultaneously. Our SPACDC scheme guarantees data security during transmission using a new encryption algorithm based on elliptic curve cryptography. Notably, the SPACDC scheme does not impose a strict constraint on the minimum number of results that must be waited for. An extensive performance analysis is conducted to demonstrate the effectiveness of our SPACDC scheme. Furthermore, we present a secure and private distributed learning algorithm based on the SPACDC scheme, which provides information-theoretic privacy protection for training data. Our experiments show that the SPACDC-based deep learning algorithm achieves a significant speedup over the baseline approaches.
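To make the straggler and privacy setting concrete, the following is a minimal sketch of a generic Lagrange-style coded computation over a prime field. It is not the SPACDC scheme and it omits the elliptic-curve encryption step. The abstract gives no construction details, so every name and parameter here (`P`, `encode`, `decode`, the evaluation points `betas` and `alphas`, two data blocks, one random mask, five workers) is an assumption chosen for illustration.

```python
import random

P = 2_147_483_647  # prime modulus (2**31 - 1); all arithmetic is done in GF(P)

def lagrange_basis(points, j, z):
    """Value at z of the basis polynomial that is 1 at points[j] and 0 at the other points."""
    num, den = 1, 1
    for m, pm in enumerate(points):
        if m != j:
            num = num * (z - pm) % P
            den = den * (points[j] - pm) % P
    return num * pow(den, -1, P) % P  # modular inverse, Python 3.8+

def interpolate(points, values, z):
    """Evaluate at z the unique polynomial through (points[j], values[j]); values are vectors."""
    out = [0] * len(values[0])
    for j, vec in enumerate(values):
        c = lagrange_basis(points, j, z)
        out = [(o + c * v) % P for o, v in zip(out, vec)]
    return out

def matvec(block, w):
    """Worker task: multiply one (encoded) block by the vector w."""
    return [sum(a * b for a, b in zip(row, w)) % P for row in block]

def encode(blocks, betas, alphas, rng):
    """Append one uniformly random mask block, then give each worker one evaluation
    of the polynomial that passes through the data blocks and the mask at betas."""
    rows, cols = len(blocks[0]), len(blocks[0][0])
    mask = [[rng.randrange(P) for _ in range(cols)] for _ in range(rows)]
    padded = blocks + [mask]
    shares = []
    for a in alphas:
        coeffs = [lagrange_basis(betas, j, a) for j in range(len(padded))]
        shares.append([[sum(c * blk[r][k] for c, blk in zip(coeffs, padded)) % P
                        for k in range(cols)] for r in range(rows)])
    return shares

def decode(alphas_done, results, betas, num_blocks):
    """Interpolate the result polynomial from the finished workers and read off
    its values at the points that correspond to the original data blocks."""
    return [interpolate(alphas_done, results, betas[k]) for k in range(num_blocks)]

# Hypothetical toy instance: 2 data blocks, 1 mask block, 5 workers.
rng = random.Random(0)
blocks = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
w = [10, 1]
betas = [1, 2, 3]          # one point per data block plus one for the mask
alphas = [4, 5, 6, 7, 8]   # one point per worker, disjoint from betas

shares = encode(blocks, betas, alphas, rng)
results = [matvec(share, w) for share in shares]

finished = [4, 0, 2]       # any three workers suffice; the other two are stragglers
decoded = decode([alphas[i] for i in finished],
                 [results[i] for i in finished], betas, len(blocks))
print(decoded)
```

In this sketch the worker task is linear, so the result polynomial has degree 2 and any three of the five results are enough for exact recovery. In Lagrange-style schemes of this kind, a task of higher polynomial degree raises the degree of the result polynomial and therefore the number of results needed, which is the third limitation named in the abstract. An approximated scheme, as the abstract describes, relaxes that strict minimum.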

