Processing convolution layers remains a major bottleneck for private deep convolutional neural network (CNN) inference on large datasets. To address this issue, this paper presents a novel homomorphic convolution algorithm that improves speed and reduces both communication cost and storage. We first note that padded convolution saves model storage but does not support channel packing, which increases the amount of computation and communication. We address this limitation by proposing a novel plaintext multiplication algorithm based on the Walsh-Hadamard matrix. Furthermore, we propose optimization techniques that significantly reduce the latency of the proposed convolution by selecting optimal encryption parameters and applying lazy reduction. The proposed convolution achieves a 1.6-3.8x speedup and reduces weight storage by 2000-8000x compared to conventional convolution. When employed in CNNs such as VGG-16, ResNet-20, and MobileNetV1 on ImageNet, it reduces end-to-end latency by 1.3-2.6x, memory usage by 2.1-7.9x, and communication cost by 1.7-2.0x compared to the conventional method.
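For readers unfamiliar with two of the building blocks named in the abstract, the following is a minimal illustrative sketch in plain Python. It is not the paper's algorithm: it only shows the standard Sylvester construction of a Walsh-Hadamard matrix and the general idea of lazy reduction (accumulating products and applying the modulus once at the end instead of after every step). The function names `hadamard`, `dot_mod_eager`, and `dot_mod_lazy`, and the modulus used in the usage note, are assumptions chosen for illustration.

```python
def hadamard(n):
    """Return the n x n Walsh-Hadamard matrix (Sylvester construction).

    n must be a power of two. Entries are +1 or -1, and distinct rows
    are mutually orthogonal.
    """
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    h = [[1]]
    while len(h) < n:
        # H_{2k} = [[H_k, H_k], [H_k, -H_k]]
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h


def dot_mod_eager(a, b, q):
    """Inner product mod q, reducing after every multiply-accumulate."""
    acc = 0
    for x, y in zip(a, b):
        acc = (acc + x * y) % q
    return acc


def dot_mod_lazy(a, b, q):
    """Inner product mod q with lazy reduction: one reduction at the end.

    Python integers are unbounded, so this is always safe here. With
    fixed-width machine words, the accumulator needs enough headroom to
    hold the unreduced sum (an assumption this sketch does not check).
    """
    acc = 0
    for x, y in zip(a, b):
        acc += x * y
    return acc % q
```

Both dot-product variants return the same residue; for example, with the hypothetical modulus `q = 97`, `dot_mod_lazy([5, 7, 11], [13, 17, 19], 97)` equals `dot_mod_eager([5, 7, 11], [13, 17, 19], 97)`. The lazy variant simply performs fewer modular reductions, which is the kind of saving lazy reduction targets.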