This paper introduces Distribution-Flexible Subset Quantization (DFSQ), a post-training quantization method for super-resolution networks. DFSQ is motivated by the distinctive activation distributions of current super-resolution models, which vary significantly across samples and channels. To address this, DFSQ normalizes the activations channel-wise and then applies distribution-flexible subset quantization (SQ), in which the quantization points are selected from a universal set of multi-word additive log-scale values. To speed up the selection of quantization points in SQ, we propose a fast selection strategy that uses K-means clustering and picks the universal-set values closest to the resulting centroids. Compared with the common iterative exhaustive search, our strategy avoids enumerating all possible combinations in the universal set, reducing the time complexity from exponential to linear. As a result, the universal set can be made much larger without search time becoming the limiting factor. Extensive evaluations on various super-resolution models show that DFSQ effectively retains performance even without fine-tuning. For example, when quantizing EDSRx2 on the Urban benchmark, DFSQ achieves performance comparable to its full-precision counterpart at 6 and 8 bits, and incurs only a 0.1 dB PSNR drop at 4 bits. Code is available at \url{https://github.com/zysxmu/DFSQ}.
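The pipeline described in the abstract (channel-wise normalization, a universal set of additive log-scale values, and K-means-based point selection) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the helper names (`build_universal_set`, `kmeans_1d`, `select_points`, `quantize`), the particular word layout, the quantile initialization, and the rescaling of the universal set to the data range are all assumptions made for illustration.

```python
import itertools
import numpy as np

def build_universal_set(word_bits=(2, 2)):
    """Hypothetical universal set: signed sums of one value per log-scale word."""
    words = []
    for i, bits in enumerate(word_bits):
        # Each word holds 0 plus a few powers of two; the interleaved
        # exponents are an assumption, chosen so that all sums are distinct.
        exps = [-(len(word_bits) * k + i) for k in range(2 ** bits - 1)]
        words.append([0.0] + [2.0 ** e for e in exps])
    sums = {sum(combo) for combo in itertools.product(*words)}
    return np.array(sorted(sums | {-s for s in sums}))

def kmeans_1d(x, k, iters=20):
    """Plain Lloyd iterations on scalars, initialized at evenly spaced quantiles."""
    centroids = np.quantile(x, (np.arange(k) + 0.5) / k)
    for _ in range(iters):
        labels = np.abs(x[:, None] - centroids[None, :]).argmin(axis=1)
        for j in range(k):
            members = x[labels == j]
            if members.size:
                centroids[j] = members.mean()
    return centroids

def select_points(x, universal, bits):
    """Snap each centroid to its nearest universal-set value (duplicates merged)."""
    centroids = kmeans_1d(x, 2 ** bits)
    nearest = np.abs(centroids[:, None] - universal[None, :]).argmin(axis=1)
    return np.unique(universal[nearest])

def quantize(x, points):
    """Map every element of x to the closest quantization point."""
    idx = np.abs(x[..., None] - points).argmin(axis=-1)
    return points[idx]

# Toy activations (N, C, H, W) whose channels have very different scales.
rng = np.random.default_rng(0)
acts = rng.normal(size=(4, 3, 8, 8)) * np.array([1.0, 5.0, 0.2]).reshape(1, 3, 1, 1)

# Channel-wise normalization before quantization.
mean = acts.mean(axis=(0, 2, 3), keepdims=True)
std = acts.std(axis=(0, 2, 3), keepdims=True) + 1e-8
normed = (acts - mean) / std

# Rescale the universal set to cover the data range (an assumption).
universal = build_universal_set()
universal = universal * np.abs(normed).max() / np.abs(universal).max()

points = select_points(normed.ravel(), universal, bits=3)
dequant = quantize(normed, points) * std + mean  # undo the normalization
```

In this sketch the selection step costs one nearest-neighbour lookup per centroid, so it grows linearly with the size of the universal set, whereas an exhaustive search would have to score every possible subset of that set. Merging duplicate points with `np.unique` means fewer than `2 ** bits` points may be returned; a fuller implementation would need a policy for that case.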