Cloud-based services are making the outsourcing of sensitive client data increasingly common. Although homomorphic encryption (HE) offers strong privacy guarantees, it requires substantially more resources than computing on plaintext, often leading to unacceptably large result latencies. HE accelerators have emerged to mitigate this latency issue, but at the high cost of custom ASICs. In this paper, we show that HE primitives can be converted into AI operators and accelerated on existing ASIC AI accelerators, such as TPUs, which are already widely deployed in the cloud. Adapting such accelerators for HE requires (1) supporting modular multiplication, (2) implementing high-precision arithmetic in software, and (3) mapping operations efficiently onto matrix engines. We introduce the CROSS compiler, which (1) adopts Barrett reduction to provide modular reduction support using only multipliers and adders, (2) applies Basis Aligned Transformation (BAT) to convert high-precision multiplication into low-precision matrix-vector multiplication, and (3) applies Matrix Aligned Transformation (MAT) to convert vectorized modular operations with reduction into matrix multiplications that can be processed efficiently on a 2D spatial matrix engine. Our evaluation of CROSS on a Google TPUv4 demonstrates significant performance improvements, with up to 161x and 5x speedups over previous work on many-core CPUs and the V100 GPU, respectively. The kernel-level code is open-sourced at https://github.com/google/jaxite/tree/main/jaxite_word.
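The Barrett reduction mentioned above is a standard technique for computing a modular remainder without a hardware divider, using one precomputed constant plus multiplies, shifts, and a conditional subtraction. A minimal Python sketch of the textbook algorithm follows; the function and variable names are illustrative and not taken from the CROSS codebase:

```python
def barrett_reduce(x, q, k=None):
    """Compute x mod q via Barrett reduction (no division at runtime).

    Valid for 0 <= x < q**2 with q < 2**k: the quotient estimate t is
    off by at most one, so a single conditional subtraction suffices.
    """
    if k is None:
        k = q.bit_length()
    m = (1 << (2 * k)) // q       # precomputed once per modulus q
    t = (x * m) >> (2 * k)        # estimate of floor(x / q)
    r = x - t * q                 # guaranteed to lie in [0, 2q)
    return r - q if r >= q else r
```

In hardware terms, the division by `q` happens only once at compile time (computing `m`); each runtime reduction is two multiplies, a shift, and a subtract, which is why it maps onto multiplier/adder datapaths.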
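The core idea behind expressing high-precision multiplication as low-precision matrix-vector multiplication can be illustrated with a toy sketch: split each big integer into w-bit limbs, so that the product becomes a limb-wise convolution, which is exactly a Toeplitz-matrix-vector product. This is a generic radix-decomposition illustration under our own naming, not the paper's BAT implementation:

```python
def to_limbs(x, w, n):
    # Split integer x into n little-endian limbs of w bits each.
    mask = (1 << w) - 1
    return [(x >> (w * i)) & mask for i in range(n)]

def limb_matvec_multiply(a, b, w=8, n=4):
    """Toy high-precision multiply via a low-precision matrix-vector product.

    The limb convolution a*b is T @ A, where T is the Toeplitz matrix
    built from b's limbs -- the kind of operand a 2D matrix engine consumes.
    """
    A = to_limbs(a, w, n)
    B = to_limbs(b, w, n)
    # Toeplitz matrix: T[i][j] = B[i-j], zero outside the valid range.
    T = [[B[i - j] if 0 <= i - j < n else 0 for j in range(n)]
         for i in range(2 * n - 1)]
    conv = [sum(T[i][j] * A[j] for j in range(n)) for i in range(2 * n - 1)]
    # Fold carries by shift-and-add to recover the full-precision product.
    return sum(c << (w * i) for i, c in enumerate(conv))
```

Each entry of `T` and `A` fits in w bits, so the matrix engine only ever sees low-precision operands; the wide result is reassembled afterwards by the shift-and-add carry fold.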