Recent work using Fully Homomorphic Encryption (FHE) has made non-interactive privacy-preserving inference of deep Convolutional Neural Networks (CNNs) possible. However, the performance of these methods remains limited by their heavy reliance on bootstrapping, a costly FHE operation applied across multiple layers, which severely slows inference. Moreover, they depend on high-degree polynomial approximations of non-linear activations, which increase multiplicative depth and reduce accuracy by 2-5% compared to plaintext ReLU models. In this work, we close the accuracy gap between FHE-based non-interactive CNNs and their plaintext counterparts, while also achieving faster inference than existing methods. We propose a quadratic polynomial approximation of ReLU, which achieves the theoretical minimum multiplicative depth for non-linear activations, together with a penalty-based training strategy. We further introduce structural optimizations that reduce the required FHE levels in CNNs by a factor of five compared to prior work, allowing us to run deep CNN models under leveled FHE without bootstrapping. To further accelerate inference and recover the accuracy typically lost with polynomial approximations, we introduce parameter clustering along with a joint strategy of data layout and ensemble techniques. Experiments with VGG and ResNet models on CIFAR and Tiny-ImageNet datasets show that our approach achieves up to $4\times$ faster private inference than prior work, with accuracy comparable to plaintext ReLU models.
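To give intuition for why a quadratic approximation attains the minimum multiplicative depth, the following is a minimal sketch (not the paper's actual method or coefficients, which involve penalty-based training): a degree-2 least-squares fit of ReLU on a bounded interval, whose evaluation consumes a single ciphertext-ciphertext multiplication ($x \cdot x$), i.e. one multiplicative level.

```python
import numpy as np

# Illustrative assumption: fit a degree-2 polynomial to ReLU on [-4, 4]
# by ordinary least squares. The paper's coefficients come from a
# penalty-based training strategy, not from this fit.
x = np.linspace(-4.0, 4.0, 1001)
relu = np.maximum(x, 0.0)
c2, c1, c0 = np.polyfit(x, relu, deg=2)  # c2*x^2 + c1*x + c0

def poly_relu(v):
    # Under FHE, v*v is the only ciphertext-ciphertext multiplication,
    # so this activation costs exactly one multiplicative level --
    # the theoretical minimum for any non-linear function.
    return c2 * v * v + c1 * v + c0

max_err = np.max(np.abs(poly_relu(x) - relu))
```

By symmetry of the fitting interval the linear coefficient is close to 0.5 (since $\mathrm{ReLU}(x) = (x + |x|)/2$), and only the even part $|x|$ is actually approximated. A higher-degree polynomial would approximate ReLU more tightly but consume additional levels, which is precisely the trade-off the abstract's training strategy addresses.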