This paper introduces Distribution-Flexible Subset Quantization (DFSQ), a post-training quantization method for super-resolution networks. DFSQ is motivated by the distinctive activation distributions of current super-resolution models, which vary significantly across samples and channels. To address this, DFSQ normalizes the activations channel-wise and then applies distribution-flexible subset quantization (SQ), in which the quantization points are selected from a universal set of multi-word additive log-scale values. To speed up the selection of quantization points in SQ, we propose a fast selection strategy that runs K-means clustering and picks the universal-set values closest to the centroids. Compared with the common iterative exhaustive search, our strategy avoids enumerating all possible combinations in the universal set, reducing the time complexity from exponential to linear. Consequently, the limit that time cost places on the size of the universal set is greatly relaxed. Extensive evaluations on various super-resolution models show that DFSQ retains performance even without fine-tuning. For example, when quantizing EDSRx2 on the Urban benchmark, DFSQ achieves performance comparable to its full-precision counterpart at 6- and 8-bit quantization and incurs only a 0.1 dB PSNR drop at 4-bit quantization.
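The pipeline summarized above (channel-wise normalization, K-means clustering, and snapping each centroid to the closest universal-set value) can be illustrated with a minimal sketch. This is not the reference implementation: the two word sets `WORD_A` and `WORD_B`, the signed two-word layout of the universal set, the quantile-based K-means initialization, and the rule that each centroid takes the closest value not already chosen are all assumptions made for illustration, and every function name is hypothetical.

```python
import itertools
import random
import statistics

# Hypothetical word sets: zero plus a few powers of two (assumed, not from the paper).
WORD_A = [0.0, 0.25, 0.5, 1.0, 2.0]
WORD_B = [0.0, 0.125, 0.25, 0.5, 1.0]


def universal_set(word_a, word_b):
    """All signed sums a + b with one value drawn from each word set."""
    sums = {a + b for a, b in itertools.product(word_a, word_b)}
    return sorted(sums | {-s for s in sums})


def normalize_channel(values):
    """Zero-mean, unit-variance normalization of one channel's activations."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values) or 1.0  # guard against a constant channel
    return [(v - mean) / std for v in values], mean, std


def kmeans_1d(values, k, iters=20):
    """Plain Lloyd's algorithm in 1-D, initialized at evenly spaced quantiles."""
    ordered = sorted(values)
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            buckets[nearest].append(v)
        # Keep the previous centroid when a cluster ends up empty.
        centroids = [statistics.fmean(b) if b else c for b, c in zip(buckets, centroids)]
    return centroids


def select_points(centroids, universe):
    """For each centroid, take the closest universal-set value not already chosen."""
    chosen = []
    for c in sorted(centroids):
        candidates = [u for u in universe if u not in chosen]
        chosen.append(min(candidates, key=lambda u: abs(u - c)))
    return sorted(chosen)


def quantize(values, points):
    """Map every value to its nearest quantization point."""
    return [min(points, key=lambda p: abs(p - v)) for v in values]


# Toy 4-bit example on synthetic activations for a single channel.
random.seed(0)
channel = [random.gauss(3.0, 2.0) for _ in range(512)]
normed, mean, std = normalize_channel(channel)
universe = universal_set(WORD_A, WORD_B)
points = select_points(kmeans_1d(normed, k=16), universe)
dequantized = [q * std + mean for q in quantize(normed, points)]
```

In this sketch the selection step scans the universal set once per centroid, so its cost grows with the size of the universal set rather than with the number of possible subsets of it, which is the property the fast selection strategy relies on.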