As neural networks continue to grow in scale, low-precision quantization is widely used in edge accelerators. Classic multi-threshold activation hardware requires 2^n thresholds for n-bit outputs, so hardware cost grows rapidly with output precision. We propose GRAU, reconfigurable activation hardware based on piecewise linear fitting in which segment slopes are approximated by powers of two. The design requires only basic comparators and 1-bit right shifters, and supports mixed-precision quantization as well as nonlinear functions such as SiLU. Compared with multi-threshold activators, GRAU reduces LUT consumption by over 90% while achieving higher hardware efficiency, flexibility, and scalability.
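To make the mechanism concrete, the following is a minimal software sketch of a piecewise-linear activation in which each segment's slope is a power of two, so the multiply reduces to a right shift and segment selection needs only comparisons. The segment boundaries, shift amounts, and offsets below are illustrative placeholders, not values from the GRAU design, and the cascaded 1-bit shifter of the hardware is modeled here as a single variable-width shift.

```c
#include <stdint.h>
#include <stdio.h>

/* One segment of the piecewise-linear approximation.
 * slope = 2^(-shift), applied as an arithmetic right shift. */
typedef struct {
    int32_t x_min;   /* lower bound of the segment (selected with >=) */
    uint8_t shift;   /* right-shift amount implementing the slope */
    int32_t offset;  /* additive offset of the segment */
} Segment;

/* Example 4-segment table roughly shaped like a SiLU curve
 * (placeholder values for illustration only). */
static const Segment segments[] = {
    { INT32_MIN, 3, -2 },  /* x < -8       : y ~ (x >> 3) - 2 */
    { -8,        2, -1 },  /* -8 <= x < 0  : y ~ (x >> 2) - 1 */
    { 0,         1,  0 },  /* 0 <= x < 8   : y ~  x >> 1      */
    { 8,         0,  0 },  /* x >= 8       : y ~  x           */
};

static int32_t pwl_activate(int32_t x) {
    /* Comparators select the active segment: take the last segment
     * whose lower bound is <= x (table is sorted by x_min). */
    int idx = 0;
    for (int i = 0; i < (int)(sizeof segments / sizeof segments[0]); i++) {
        if (x >= segments[i].x_min) idx = i;
    }
    /* The shifter applies the power-of-two slope; assumes arithmetic
     * right shift for negative values, as on common platforms. */
    return (x >> segments[idx].shift) + segments[idx].offset;
}

int main(void) {
    for (int32_t x = -16; x <= 16; x += 4)
        printf("x=%3d -> y=%3d\n", (int)x, (int)pwl_activate(x));
    return 0;
}
```

Because every slope is a power of two, the datapath needs no multipliers: the comparisons map to the comparators and the shift maps to the shifter stage described above, which is what keeps the LUT footprint small relative to a 2^n multi-threshold activator.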