Optimizing Layerwise Polynomial Approximation for Efficient Private Inference on Fully Homomorphic Encryption: A Dynamic Programming Approach

Recent research has explored implementing privacy-preserving deep neural networks using fully homomorphic encryption alone. However, the practicality of this approach has been limited by long inference times. When a pre-trained model is used without retraining, a major contributor to these long inference times is the high-degree polynomial approximation of activation functions such as ReLU. A high-degree approximation consumes a substantial amount of homomorphic computational resources, resulting in slower inference. Unlike previous works, which approximate activation functions uniformly and conservatively, this paper presents a \emph{layerwise} degree optimization of activation functions that aggressively reduces inference time while maintaining classification accuracy by taking into account the characteristics of each layer. Instead of the minimax approximation commonly used in state-of-the-art private inference models, we employ weighted least squares approximation based on the input distributions of the activation functions. We then obtain the layerwise optimized degrees for the activation functions through a \emph{dynamic programming} algorithm that considers how each layer's approximation error affects the classification accuracy of the deep neural network. Furthermore, we propose modulating the ciphertext modulus chain layerwise to reduce inference time. With these layerwise optimization methods, we reduce inference times for the ResNet-20 and ResNet-32 models by factors of 3.44 and 3.16, respectively, compared with prior implementations that employ uniform-degree polynomials and a consistent ciphertext modulus.
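
The abstract does not spell out the dynamic programming formulation, so the following is only a minimal sketch of the general idea under simplifying assumptions that are ours, not the paper's: each layer picks one polynomial degree from a small candidate set, each candidate has an integer cost and an error contribution, the per-layer errors add up, and the total cost must stay within a budget. The function name `choose_degrees`, the `(degree, cost, error)` tuples, and all numbers in the toy table are hypothetical and chosen purely for illustration.

```python
from math import inf

def choose_degrees(options, budget):
    """Pick one degree per layer to minimise summed error within a cost budget.

    options[i] is a list of (degree, cost, error) tuples for layer i
    (hypothetical values; costs are integers in arbitrary units).
    Returns (total_error, degrees), or (inf, None) if no choice fits.
    """
    # best[b] = (error, degrees) for the layers processed so far
    # whose total cost is exactly b
    best = {0: (0.0, [])}
    for layer in options:
        nxt = {}
        for used, (err, degs) in best.items():
            for degree, cost, e in layer:
                b = used + cost
                if b > budget:
                    continue  # this choice would exceed the budget
                cand = (err + e, degs + [degree])
                if b not in nxt or cand[0] < nxt[b][0]:
                    nxt[b] = cand
        best = nxt
    if not best:
        return inf, None
    return min(best.values(), key=lambda t: t[0])

# Toy table (illustrative only): in this made-up example the first layer
# is the most sensitive to approximation error and the last layer the least.
options = [
    [(3, 2, 0.30), (7, 3, 0.10), (15, 4, 0.02)],
    [(3, 2, 0.20), (7, 3, 0.05), (15, 4, 0.01)],
    [(3, 2, 0.05), (7, 3, 0.02), (15, 4, 0.01)],
]
err, degs = choose_degrees(options, budget=9)
print(degs, round(err, 2))  # [15, 7, 3] 0.12
```

In this toy table, using degree 7 in every layer has the same total cost of 9 but a summed error of 0.17, so the layerwise choice is better within the same budget. How the paper actually measures each layer's effect on classification accuracy is not given in the abstract and is not modelled here.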
