Mixed-precision quantization has become an important technique for enabling the execution of deep neural networks (DNNs) on resource-limited computing platforms. Traditional quantization methods have primarily concentrated on maintaining neural network accuracy, either ignoring the impact of quantization on the robustness of the network or using only empirical techniques to improve robustness. In contrast, techniques for robustness certification, which can provide strong guarantees about the robustness of DNNs, have not been used during quantization due to their high computational cost. This paper introduces ARQ, an innovative mixed-precision quantization method that preserves not only the clean accuracy of smoothed classifiers but also their certified robustness. ARQ uses reinforcement learning to find accurate and robust DNN quantizations, while efficiently leveraging randomized smoothing, a popular class of statistical DNN verification algorithms, to guide the search process. We compare ARQ with multiple state-of-the-art quantization techniques on several DNN architectures commonly used in quantization studies: ResNet-20 on CIFAR-10, and ResNet-50 and MobileNetV2 on ImageNet. We demonstrate that ARQ consistently outperforms these baselines across all benchmarks and input perturbation levels. In many cases, ARQ-quantized networks can match the performance of the original floating-point DNN while executing only 1.5% of the instructions.
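To illustrate the certification primitive the abstract refers to, the following is a minimal Monte-Carlo sketch of randomized smoothing: the smoothed classifier predicts the majority class of the base classifier under Gaussian input noise, and (when the estimated top-class probability exceeds 1/2) yields a certified L2 radius proportional to σ·Φ⁻¹(p). This is a toy illustration of the general technique, not ARQ's implementation; the function name, the toy classifier, and the use of a point estimate in place of a proper confidence lower bound are all simplifying assumptions.

```python
import random
from statistics import NormalDist


def smoothed_predict(classifier, x, sigma=0.25, n=1000, seed=0):
    """Monte-Carlo estimate of the smoothed classifier
    g(x) = argmax_c P[f(x + noise) = c], with noise ~ N(0, sigma^2 I).
    Returns (predicted class, certified L2 radius)."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(n):
        # Perturb each coordinate with i.i.d. Gaussian noise.
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        c = classifier(noisy)
        counts[c] = counts.get(c, 0) + 1
    top_class, top_count = max(counts.items(), key=lambda kv: kv[1])
    # NOTE: a sound certificate uses a confidence lower bound (e.g.
    # Clopper-Pearson) on the top-class probability; a clamped point
    # estimate is used here only to keep the sketch short.
    p_lower = min(top_count / n, 1.0 - 1e-6)
    radius = sigma * NormalDist().inv_cdf(p_lower) if p_lower > 0.5 else 0.0
    return top_class, radius


# Toy 1-D "classifier": the label is the sign of the sum of the inputs.
f = lambda v: int(sum(v) > 0)
label, radius = smoothed_predict(f, [2.0, 1.5], sigma=0.25)
```

Because the toy input sits far from the decision boundary relative to σ, the majority vote is stable and a positive certified radius is returned; quantizing the base classifier changes its predictions under noise, which is why ARQ must account for smoothing during its search.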