Approximate computing is a promising approach to reducing power, delay, and area in hardware designs for error-resilient applications such as machine learning (ML) and digital signal processing (DSP) systems, in which multipliers are usually key arithmetic units. Because of the underlying architectural differences between ASICs and FPGAs, existing ASIC-based approximate multipliers do not offer comparable gains when implemented with FPGA resources. In this paper, we propose AMG, an open-source automated approximate multiplier generator for FPGAs driven by Bayesian optimization (BO) with parallel evaluation. The proposed method simplifies the exact half adders (HAs) used for the initial partial product (PP) compression in a multiplier while preserving coarse-grained additions for the subsequent accumulation. The generated multipliers map effectively to the lookup tables (LUTs) and carry chains provided by modern FPGAs, reducing hardware cost with acceptable error. Compared with 1167 multipliers from previous works, our generated multipliers form a Pareto front with average improvements of 28.70%-38.47% in the product of hardware cost and error. All source code, reproduced multipliers, and our generated multipliers are available at https://github.com/phyzhenli/AMG.
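To make the idea of simplifying half adders in the initial PP compression concrete, the following is a minimal behavioral sketch in Python, not AMG's actual generator. It rests on several assumptions that do not come from the paper: adjacent partial-product rows are paired, each overlapping column uses one half adder, a "simplified" half adder is modeled as a single OR gate with its carry dropped, and the function names (`approx_multiply`, `mean_error_distance`, `pareto_front`) are hypothetical. The BO search and FPGA cost evaluation are omitted entirely; `pareto_front` only shows how (cost, error) pairs could be filtered once such numbers are available.

```python
def approx_multiply(a, b, simplified, n=4):
    """Behavioral model of an n-bit unsigned multiplier (n even).

    `simplified` is a set of (row_pair_start, column) positions whose half
    adder is replaced by an OR gate with no carry (an assumed simplification).
    All remaining additions are exact.
    """
    # rows[i][j] is the partial-product bit a_j & b_i, with weight 2**(i + j).
    rows = [[(a >> j) & (b >> i) & 1 for j in range(n)] for i in range(n)]
    total = 0
    for p in range(0, n, 2):
        top, bottom = rows[p], rows[p + 1]
        total += top[0] << p             # lowest bit of the pair, no partner
        total += bottom[n - 1] << (p + n)  # highest bit of the pair, no partner
        for j in range(n - 1):
            x, y = top[j + 1], bottom[j]   # two bits of equal weight
            w = p + 1 + j
            if (p, j) in simplified:
                total += (x | y) << w      # simplified HA: OR, carry dropped
            else:
                total += ((x ^ y) << w) + ((x & y) << (w + 1))  # exact HA
    return total


def mean_error_distance(simplified, n=4):
    """Average |approximate - exact| over all pairs of n-bit inputs."""
    size = 1 << n
    err = sum(abs(approx_multiply(a, b, simplified, n) - a * b)
              for a in range(size) for b in range(size))
    return err / (size * size)


def pareto_front(points):
    """Keep the (cost, error) points that no other point dominates."""
    front, best_err = [], float("inf")
    for cost, err in sorted(points):
        if err < best_err:
            front.append((cost, err))
            best_err = err
    return front
```

With an empty `simplified` set the model reduces to exact multiplication, so any error is attributable to the chosen half-adder positions. In a search loop, each candidate set would be one configuration to evaluate, with the hardware cost supplied by an external synthesis flow.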