Efficient networks such as MobileNetV2 and EfficientNet achieve state-of-the-art (SOTA) accuracy with lightweight computation. However, existing homomorphic encryption (HE)-based two-party computation (2PC) frameworks are not optimized for these networks and suffer from high inference overhead. We observe that the inefficiency mainly comes from the packing algorithm, which ignores the computation characteristics and the communication bottleneck of homomorphically encrypted depthwise convolutions. Therefore, in this paper, we propose Falcon, an effective dense packing algorithm for HE-based 2PC frameworks. Falcon features a zero-aware greedy packing algorithm and a communication-aware operator tiling strategy to improve the packing density for depthwise convolutions. Compared to SOTA HE-based 2PC frameworks, e.g., CrypTFlow2, Iron, and Cheetah, Falcon achieves more than 15.6x, 5.1x, and 1.8x latency reduction, respectively, at the operator level. Meanwhile, at the network level, Falcon improves accuracy over Cheetah by 1.4% and 4.2% on the CIFAR-100 and TinyImagenet datasets, respectively, under iso-communication.