The adoption of machine learning solutions is rapidly increasing across all parts of society. As models grow larger, both training and inference are increasingly outsourced, e.g., to cloud service providers. This means that potentially sensitive data is processed on untrusted platforms, which bears inherent data security and privacy risks. In this work, we investigate how to protect distributed machine learning systems, focusing on deep convolutional neural networks. The most common and best-performing mixed MPC approaches are based on homomorphic encryption (HE), secret sharing, and garbled circuits. They commonly suffer from large performance overheads, significant accuracy losses, and communication overheads that grow linearly with the depth of the neural network. To address these problems, we present Dash, a fast and distributed private convolutional neural network inference scheme secure against malicious attackers. Building on arithmetic garbling gadgets [BMR16] and fancy-garbling [BCM+19], Dash is based purely on arithmetic garbled circuits. We introduce LabelTensors, which allow us to leverage the massive parallelism of modern GPUs. Combined with state-of-the-art garbling optimizations, Dash outperforms previous garbling approaches by a factor of up to about 100. Furthermore, we extend arithmetic garbled circuits with an efficient scaling operation over the residues of the Chinese remainder theorem representation, which allows us to garble larger networks and achieve much higher accuracy than previous approaches. Finally, Dash requires only a single communication round per inference step, regardless of the depth of the neural network, and a very small, constant online communication volume.
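As background for the scaling operation mentioned above: arithmetic garbled circuits in the CRT setting represent an integer by its residues modulo a set of pairwise-coprime moduli, so that addition and multiplication act component-wise on the residues. The following is a minimal, hypothetical sketch of plain (ungarbled) CRT encoding and reconstruction, not Dash's actual implementation; the moduli are an arbitrary illustrative choice.

```python
from math import prod

# Pairwise-coprime moduli; a hypothetical choice for illustration.
# Together they represent integers in [0, 7*11*13*17) = [0, 17017).
MODULI = [7, 11, 13, 17]

def to_crt(x, moduli=MODULI):
    """Encode x as its residues modulo each modulus."""
    return [x % m for m in moduli]

def from_crt(residues, moduli=MODULI):
    """Recover x from its residues via the Chinese remainder theorem."""
    M = prod(moduli)
    x = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        # pow(Mi, -1, m) is the modular inverse of Mi mod m (Python 3.8+).
        x += r * Mi * pow(Mi, -1, m)
    return x % M

# Arithmetic is component-wise on the residues; this independence per
# modulus is what makes CRT-based representations amenable to the kind
# of parallel evaluation LabelTensors exploit on GPUs.
a, b = 1234, 567
summed = [(x + y) % m for x, y, m in zip(to_crt(a), to_crt(b), MODULI)]
assert from_crt(summed) == (a + b) % prod(MODULI)
```

Note that exact scaling (division by a constant) is not component-wise in this representation, which is why a dedicated scaling operation over the residues is a nontrivial addition.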