HAAC: A Hardware-Software Co-Design to Accelerate Garbled Circuits

Privacy and security have rapidly emerged as priorities in system design. One powerful solution for providing both is privacy-preserving computation (PPC), where functions are computed directly on encrypted data and control can be provided over how data is used. Garbled circuits (GCs) are a PPC technology that provides both confidential computing and control over how data is used. The challenge is that they incur significant performance overheads compared to plaintext computation. This paper proposes a novel garbled circuits accelerator and compiler, named HAAC, to mitigate these overheads and make privacy-preserving computation more practical. HAAC is a hardware-software co-design. GCs are exemplars of co-design because programs are completely known at compile time, i.e., all dependences, memory accesses, and control flow are fixed. The design philosophy of HAAC is to keep hardware simple and efficient, maximizing the area devoted to our proposed custom execution units and other circuits essential for high performance (e.g., on-chip storage). The compiler can leverage its understanding of the program to realize the hardware's performance potential by generating effective instruction schedules and data layouts and by orchestrating off-chip events. In taking this approach we can achieve ASIC performance/efficiency without sacrificing generality. Insights of our approach include how co-design enables expressing arbitrary GC programs as streams, which simplifies hardware and enables complete memory-compute decoupling, and the development of a scratchpad that captures data reuse by tracking program execution, eliminating the need for costly hardware-managed caches and tagging logic. We evaluate HAAC with VIP-Bench and achieve an average speedup of 589$\times$ with DDR4 (2,627$\times$ with HBM2) in 4.3mm$^2$ of area.
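
To make the scratchpad idea concrete, the following is a minimal sketch of how a compiler might exploit a fully known gate list: because every read of every wire is visible at compile time, the last use of each wire can be computed up front and its scratchpad slot recycled without tags or a hardware-managed cache. This is an illustration only, not HAAC's actual compiler or scratchpad policy; the `Gate` type, the two-operation gate set, `last_use`, and `assign_slots` are all hypothetical names introduced here.

```python
from typing import NamedTuple


class Gate(NamedTuple):
    """One gate of a hypothetical straight-line GC program."""
    op: str   # e.g. "AND" or "XOR" (assumed gate set)
    a: int    # first input wire id
    b: int    # second input wire id
    out: int  # output wire id


def last_use(gates):
    """Map each wire id to the index of the last gate that reads it."""
    last = {}
    for i, g in enumerate(gates):
        last[g.a] = i
        last[g.b] = i
    return last


def assign_slots(gates, num_inputs, capacity):
    """Statically assign scratchpad slots, recycling a slot after its wire's last read.

    Wires 0..num_inputs-1 are assumed to be preloaded circuit inputs.
    Returns one (slot_a, slot_b, slot_out) triple per gate.
    """
    if num_inputs > capacity:
        raise ValueError("inputs do not fit in the scratchpad")
    last = last_use(gates)
    free = list(range(capacity - 1, -1, -1))  # stack; pop() yields slot 0 first
    slot_of = {w: free.pop() for w in range(num_inputs)}
    plan = []
    for i, g in enumerate(gates):
        sa, sb = slot_of[g.a], slot_of[g.b]
        # Release operands whose final read is this gate.
        # dict.fromkeys dedupes (a == b) while keeping a deterministic order.
        for w in dict.fromkeys((g.a, g.b)):
            if last[w] == i:
                free.append(slot_of.pop(w))
        if not free:
            # A real compiler would schedule an off-chip spill here; not modeled.
            raise RuntimeError("scratchpad full")
        slot_of[g.out] = free.pop()
        plan.append((sa, sb, slot_of[g.out]))
    return plan


# Three inputs (wires 0-2) and three gates producing wires 3-5.
gates = [
    Gate("XOR", 0, 1, 3),
    Gate("AND", 3, 2, 4),
    Gate("XOR", 3, 4, 5),
]
plan = assign_slots(gates, num_inputs=3, capacity=3)
print(plan)
```

In this toy circuit six wires fit in three slots because each slot is freed at a point the compiler knows in advance. The sketch raises an error when the scratchpad fills up; orchestrating off-chip transfers in that case is the part of the problem the abstract attributes to the compiler and is deliberately left out here.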
