Multiparty computation approaches to secure neural network inference commonly rely on garbled circuits to securely execute nonlinear activation functions. However, garbled circuits require excessive communication between server and client, impose significant storage overheads, and incur large runtime penalties. To reduce these costs, we propose Tabula, an alternative to garbled circuits based on secure lookup tables. Our approach precomputes, during an offline phase, lookup tables that contain the results of all possible nonlinear function calls. Because these tables incur storage costs exponential in the number of operands and the precision of the input values, we use quantization to reduce storage to a practical level. This enables an online phase in which securely computing the result of a nonlinear function requires just a single round of communication, with communication cost equal to twice the number of bits of the input to the nonlinear function. In practice, our approach costs 2 bytes of communication per nonlinear function call in the online phase. Compared to garbled circuits with 8-bit quantized inputs, when computing individual nonlinear functions during the online phase, experiments show that Tabula with 8-bit activations uses between $280$-$560 \times$ less communication, is over $100\times$ faster, and uses a comparable (within a factor of 2) amount of storage; compared against other state-of-the-art protocols, Tabula achieves greater than a $40\times$ communication reduction. These savings yield significant performance gains over garbled circuits with quantized inputs during the online phase of secure neural network inference: Tabula reduces end-to-end inference communication by up to $9 \times$ and achieves an end-to-end inference speedup of up to $50 \times$, while imposing comparable storage and offline preprocessing costs.
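To make the offline/online split concrete, below is a minimal two-party sketch of a masked-lookup-table protocol in the spirit described above. It is an illustrative simplification, not the paper's actual Tabula protocol or security model: a hypothetical trusted dealer builds additive secret shares of a table $T[i] = f(i - r)$ over an 8-bit quantized domain for a random mask $r$; online, each party reveals its share of the masked input $x + r$ (one 8-bit message each way, i.e. twice the input bit width, 2 bytes total), and both index their table share at $x + r$. All names (`relu_quantized`, `table0`, etc.) are invented for this sketch.

```python
import secrets

P = 2 ** 8  # 8-bit quantized domain; values are residues mod 256

def relu_quantized(v):
    # Hypothetical quantized ReLU: residues >= 128 are treated as negative.
    return v if v < P // 2 else 0

# --- Offline phase (simplified: a trusted dealer) ---
# Pick a random mask r and secret-share a table T[i] = f(i - r) for all i.
# Storage is P entries per nonlinear call, which is why the table grows
# exponentially in the input bit width and quantization is needed.
r = secrets.randbelow(P)
table0 = [secrets.randbelow(P) for _ in range(P)]            # party 0's table share
table1 = [(relu_quantized((i - r) % P) - table0[i]) % P      # party 1's table share
          for i in range(P)]
r0 = secrets.randbelow(P)                                    # additive shares of the mask
r1 = (r - r0) % P

# --- Online phase ---
# Parties hold additive shares x0, x1 of the activation x.
x = 200                      # example input (a "negative" residue, so ReLU -> 0)
x0 = secrets.randbelow(P)
x1 = (x - x0) % P

# Each party sends one 8-bit value (its share of x + r): a single round,
# with total communication of twice the input bit width (2 bytes here).
m0 = (x0 + r0) % P
m1 = (x1 + r1) % P
idx = (m0 + m1) % P          # reveals only the masked input x + r

# Each party locally looks up its table share; the shares sum to f(x).
y0 = table0[idx]
y1 = table1[idx]
assert (y0 + y1) % P == relu_quantized(x)
```

The revealed value `idx` is uniformly random because `r` is, so neither party learns the other's share of `x`; all the expensive work (table construction) happens offline.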