Fast convolution algorithms, including Winograd and FFT, can efficiently accelerate convolution operations in deep models. However, these algorithms depend on high-precision arithmetic to maintain inference accuracy, which conflicts with model quantization. To resolve this conflict and further improve the efficiency of quantized convolution, we propose SFC, a new algebraic transform for fast convolution that extends the Discrete Fourier Transform (DFT) with symbolic computing. At specific transform points, the transformation requires only additions, which avoids computing irrational numbers and reduces the precision requirement. Additionally, we enhance convolution efficiency by introducing correction terms that convert the invalid circular convolution outputs of the Fourier method into valid ones. We present the first numerical error analysis for this type of work, which shows that our algorithms provide a 3.68x multiplication reduction for 3x3 convolution, while the Winograd algorithm achieves only a 2.25x reduction at similarly low numerical error. Experiments on benchmarks and an FPGA show that our new algorithms further improve the computation efficiency of quantized models while maintaining accuracy, surpassing both quantization alone and existing work on quantized fast convolution.
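To make the "additions only at specific transform points" idea concrete, the following is a minimal sketch and not the SFC transform itself. It assumes the simplest such case, a 4-point DFT, whose transform points are 1, -i, -1 and i. Complex values are carried as exact integer pairs, so every twiddle multiplication reduces to a swap or a sign flip and no irrational constant appears. All function names are hypothetical. The last helper reproduces the 2.25x Winograd figure under the assumption that it refers to the standard F(2x2, 3x3) tile (16 multiplications for 4 outputs).

```python
# Illustrative sketch only: a 4-point DFT over integer (re, im) pairs.
# This is NOT the SFC algorithm; it only shows an addition-only transform.

def mul_by_i_power(z, k):
    """Multiply z = (re, im) by i**k using only swaps and negations."""
    re, im = z
    k %= 4
    if k == 0:
        return (re, im)
    if k == 1:
        return (-im, re)
    if k == 2:
        return (-re, -im)
    return (im, -re)

def dft4(x, inverse=False):
    """Unnormalised 4-point DFT; the twiddle factor is (-i)**(k*n)."""
    sign = 1 if inverse else -1
    out = []
    for k in range(4):
        acc_re, acc_im = 0, 0
        for n in range(4):
            re, im = mul_by_i_power(x[n], sign * k * n)
            acc_re += re  # the transform itself needs additions only
            acc_im += im
        out.append((acc_re, acc_im))
    return out

def cmul(a, b):
    """General complex product; the only true multiplications happen here."""
    return (a[0] * b[0] - a[1] * b[1], a[0] * b[1] + a[1] * b[0])

def circular_conv4(x, h):
    """Length-4 circular convolution of integer lists via the exact DFT."""
    X = dft4([(v, 0) for v in x])
    H = dft4([(v, 0) for v in h])
    Y = [cmul(a, b) for a, b in zip(X, H)]
    y = dft4(Y, inverse=True)
    return [re // 4 for re, _ in y]  # exact: each sum is a multiple of 4

def circular_conv4_direct(x, h):
    """Reference implementation straight from the definition."""
    return [sum(x[m] * h[(n - m) % 4] for m in range(4)) for n in range(4)]

def winograd_f2x2_3x3_reduction():
    """Assumed F(2x2, 3x3) tile: 4 outputs * 9 taps direct vs 16 products."""
    return (4 * 9) / 16
```

Because all intermediate values are integers, the transform introduces no rounding error. The outputs are circular, so only some of them match a linear convolution, which is the inefficiency the correction terms in the abstract are meant to address.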