Privacy-preserving computation techniques such as homomorphic encryption (HE) and secure multi-party computation (SMPC) enhance data security by enabling computation on encrypted data. However, the underlying cryptographic algorithms incur significant computational overhead and CPU-DRAM data movement overhead, which impedes the adoption of these techniques in practice. Existing approaches focus on reducing the computational overhead with specialized hardware such as GPUs and FPGAs, but they still suffer from the same processor-DRAM data movement overhead. Novel hardware technologies that support in-memory processing have the potential to address this problem. Memory-centric computing, or processing-in-memory (PIM), brings computation closer to data by introducing low-power processors called data processing units (DPUs) into memory. Besides its in-memory computation capability, PIM provides extensive parallelism, resulting in significant performance improvement over state-of-the-art approaches. We propose a framework that uses recently available PIM hardware to achieve efficient privacy-preserving computation. Our design consists of a four-layer architecture: (1) an application layer that decouples privacy-preserving applications from the underlying protocols and hardware; (2) a protocol layer that implements existing secure computation protocols (HE and SMPC); (3) a data orchestration layer that leverages data compression techniques to mitigate the data transfer overhead between DPUs and host memory; (4) a computation layer that implements the DPU kernels on which secure computation algorithms are built.
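The layering described above can be illustrated with a small sketch. This is not the proposed framework and it uses no real PIM SDK: the DPUs are simulated as plain Python functions, the protocol is textbook two-party additive secret sharing (a simple SMPC building block), and the orchestration step only partitions data rather than compressing it. All names (`MODULUS`, `NUM_DPUS`, `share`, `partition`, `dpu_kernel_add`, `pim_vector_add`, `secure_vector_add`) are hypothetical and chosen for illustration only.

```python
import secrets
from typing import List

MODULUS = 2**32  # assumed ring size for additive secret sharing
NUM_DPUS = 4     # hypothetical number of simulated DPUs


def share(value: int, parties: int = 2) -> List[int]:
    """Protocol layer: split value into additive shares mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def reconstruct(shares: List[int]) -> int:
    """Protocol layer: recombine additive shares into the plain value."""
    return sum(shares) % MODULUS


def partition(data: List[int], n: int) -> List[List[int]]:
    """Orchestration layer (simplified): split data into at most n chunks.

    The layer described in the text also compresses data; that is omitted here.
    """
    size = max(1, (len(data) + n - 1) // n)
    return [data[i:i + size] for i in range(0, len(data), size)]


def dpu_kernel_add(a: List[int], b: List[int]) -> List[int]:
    """Computation layer: element-wise modular addition on one chunk."""
    return [(x + y) % MODULUS for x, y in zip(a, b)]


def pim_vector_add(a: List[int], b: List[int]) -> List[int]:
    """Run the kernel on each chunk, as if each chunk lived on its own DPU."""
    out: List[int] = []
    for chunk_a, chunk_b in zip(partition(a, NUM_DPUS), partition(b, NUM_DPUS)):
        out.extend(dpu_kernel_add(chunk_a, chunk_b))
    return out


def secure_vector_add(x: List[int], y: List[int]) -> List[int]:
    """Application layer: add two private vectors via two-party shares."""
    x_shares = [share(v) for v in x]
    y_shares = [share(v) for v in y]
    per_party = []
    for party in range(2):
        xs = [s[party] for s in x_shares]
        ys = [s[party] for s in y_shares]
        # Each party adds only its own shares and never sees plain inputs.
        per_party.append(pim_vector_add(xs, ys))
    return [reconstruct([per_party[0][i], per_party[1][i]]) for i in range(len(x))]
```

In this sketch the application code calls only `secure_vector_add`, so the sharing scheme or the kernel could be swapped without touching it, which is the kind of decoupling the application layer is meant to provide.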