Privacy-preserving machine learning (PPML) solutions are gaining widespread popularity. Among these, many rely on homomorphic encryption (HE), which offers confidentiality of the model and the data but at the cost of high latency and memory requirements. Pruning neural network (NN) parameters improves latency and memory in plaintext ML but has little impact if directly applied to HE-based PPML. We introduce a framework called HE-PEx that comprises new pruning methods, built on top of a packing technique called tile tensors, for reducing the latency and memory of PPML inference. HE-PEx uses permutations to prune additional ciphertexts, and expansion to recover inference loss. We demonstrate the effectiveness of our methods for pruning fully-connected and convolutional layers in NNs on PPML tasks, namely image compression, denoising, and classification, with autoencoders, multilayer perceptrons (MLPs), and convolutional neural networks (CNNs). We implement and deploy our networks atop a framework called HElayers, and observe a 10-35% improvement in inference speed and a 17-35% decrease in memory requirement over the unpruned network, corresponding to 33-65% fewer ciphertexts, within a 2.5% degradation in inference accuracy. Compared to the state-of-the-art pruning technique for PPML, our techniques generate networks with 70% fewer ciphertexts, on average, for the same degradation limit.
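The key intuition behind permutation-aided pruning can be illustrated in the clear: under tile-tensor packing, a weight matrix is split into fixed-size tiles, each mapped to one ciphertext, and a tile can be skipped only if it is entirely zero. Magnitude pruning scatters zeros, so few tiles become empty; permuting rows (with the corresponding permutation absorbed by the adjacent layer) can cluster the zeros into whole tiles. The following minimal sketch, using a toy matrix and a hypothetical 2x2 tile size (not HElayers' actual packing or API), shows the effect:

```python
import numpy as np

# Toy "pruned" weight matrix: magnitude pruning emptied rows 0 and 2,
# yet no 2x2 tile is all-zero in the original layout.
W = np.array([
    [0., 0., 0., 0.],
    [1., 2., 3., 4.],
    [0., 0., 0., 0.],
    [5., 6., 7., 8.],
])

TILE = 2  # hypothetical tile edge for illustration

def zero_tiles(M, t=TILE):
    """Count t x t tiles that are entirely zero.

    Under tile-tensor packing, each such tile corresponds to a
    ciphertext that the HE runtime never has to store or multiply.
    """
    return sum(
        not M[i:i + t, j:j + t].any()
        for i in range(0, M.shape[0], t)
        for j in range(0, M.shape[1], t)
    )

# Permute rows so sparse rows cluster together (zero rows first).
# In a network, this neuron reordering is compensated by permuting
# the next layer's inputs, so the function computed is unchanged.
perm = np.argsort(np.count_nonzero(W, axis=1), kind="stable")
W_perm = W[perm]

print(zero_tiles(W))       # -> 0 prunable tiles before permuting
print(zero_tiles(W_perm))  # -> 2 prunable tiles after permuting
```

The same count of nonzero weights yields zero prunable ciphertexts before the permutation and two afterward, which is why pruning alone helps little in HE while permutation-aided pruning does.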