Although image super-resolution (SR) has achieved unprecedented restoration accuracy with deep neural networks, its versatile application remains limited by substantial computational costs. Since different input images pose different restoration difficulties, adapting the computational cost to the input image, referred to as adaptive inference, has emerged as a promising approach to compressing SR networks. In particular, adapting the quantization bit-widths has successfully reduced inference and memory costs without sacrificing accuracy. However, despite the benefits of the resulting adaptive network, existing works rely on time-intensive quantization-aware training with full access to the original training pairs to learn the appropriate bit allocation policies, which limits their ubiquitous usage. To this end, we introduce the first on-the-fly adaptive quantization framework, which accelerates the processing time from hours to seconds. We formulate the bit allocation problem with only two bit mapping modules: one that maps the input image to an image-wise bit adaptation factor, and one that obtains the layer-wise adaptation factors. These bit mappings are calibrated and fine-tuned using only a small number of calibration images. We achieve performance competitive with previous adaptive quantization methods, while accelerating the processing time by ×2000. Code is available at https://github.com/Cheeun/AdaBM.
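The abstract's two bit-mapping modules can be illustrated with a minimal sketch. Everything below is a hypothetical illustration, not the paper's actual method: the complexity proxy (mean adjacent-pixel difference), the threshold values, and all function names are assumptions introduced for exposition only.

```python
# Hypothetical sketch of image-wise + layer-wise bit allocation.
# The complexity measure, thresholds, and names are illustrative
# assumptions, NOT the learned bit mappings from the paper.

def image_complexity(image):
    """Mean absolute difference between horizontally adjacent pixels:
    a cheap proxy for how hard an image is to restore."""
    diffs = [abs(row[j + 1] - row[j])
             for row in image for j in range(len(row) - 1)]
    return sum(diffs) / len(diffs)

def image_bit_factor(image, thresholds=(5.0, 15.0)):
    """Image-wise bit adaptation factor: harder images get more bits."""
    c = image_complexity(image)
    if c < thresholds[0]:
        return -1   # easy image: lower the bit-width
    if c < thresholds[1]:
        return 0    # medium image: keep the base bit-width
    return 1        # hard image: raise the bit-width

def allocate_bits(image, layer_factors, base_bits=6, lo=2, hi=8):
    """Combine the image-wise factor with per-layer factors into
    final per-layer bit-widths, clipped to [lo, hi]."""
    f = image_bit_factor(image)
    return [max(lo, min(hi, base_bits + f + lf)) for lf in layer_factors]

# A flat image is "easy" and receives fewer bits than a busy one.
flat = [[0, 0, 0, 0]] * 4
busy = [[0, 255, 0, 255]] * 4
print(allocate_bits(flat, [0, 1, -1, 0]))   # → [5, 6, 4, 5]
print(allocate_bits(busy, [0, 1, -1, 0]))   # → [7, 8, 6, 7]
```

In the actual framework these mappings are calibrated and fine-tuned on a small set of calibration images rather than hand-set thresholds; the sketch only shows how an image-wise and a layer-wise factor combine into per-layer bit-widths.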