Artificial intelligence (AI) inference is trending towards lower-precision data formats of 8 bits and fewer. As multiplication is the most complex operation in typical inference tasks, there is a large demand for efficient small multipliers. Large DSP blocks are limited in how efficiently they can implement many small multipliers. Hence, this work proposes better logic-based multipliers that are especially beneficial at small word sizes. Our work is based on the multiplier tiling method, in which a multiplier is composed of several sub-multiplier tiles. Our key observation is that these sub-multipliers do not have to perform a complete (rectangular) N×K multiplication: incomplete (non-rectangular) sub-multipliers can be more efficient. We first identify efficient incomplete, irregular sub-multipliers and then demonstrate improvements over state-of-the-art designs. We show that optimal solutions can be found using integer linear programming (ILP), and we evaluate them in FPGA synthesis experiments.
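The tiling idea can be illustrated with a small software model. The sketch below is not the paper's implementation: it treats an N×K multiplier as a grid of partial products a_i·b_j·2^(i+j), treats a tile as any set of grid cells, and checks that a tiling with non-rectangular tiles still yields the exact product. The helper names (`rect`, `tile_value`, `is_exact_tiling`, `tiled_multiply`) and the tile shapes are hypothetical, chosen only for illustration, and say nothing about actual LUT cost or the ILP formulation.

```python
def bit(x, i):
    """Return bit i of the non-negative integer x."""
    return (x >> i) & 1


def rect(x0, y0, w, h):
    """Cells of a rectangular tile: bits x0..x0+w-1 of a, bits y0..y0+h-1 of b."""
    return {(i, j) for i in range(x0, x0 + w) for j in range(y0, y0 + h)}


def tile_value(tile, a, b):
    """Contribution of one tile: the weighted partial products it covers."""
    return sum((bit(a, i) * bit(b, j)) << (i + j) for (i, j) in tile)


def is_exact_tiling(tiles, n, k):
    """True if every cell of the n x k grid is covered exactly once."""
    covered = [cell for t in tiles for cell in t]
    return len(covered) == len(set(covered)) and set(covered) == rect(0, 0, n, k)


def tiled_multiply(tiles, a, b):
    """Sum the tile contributions; equals a * b when the tiling is exact."""
    return sum(tile_value(t, a, b) for t in tiles)


# Illustrative 4x4 tiling (assumed shapes, not taken from the paper):
irregular = rect(0, 0, 3, 3) - {(2, 2)}    # 3x3 tile with one corner missing
column = rect(3, 0, 1, 4)                  # ordinary 1x4 rectangular tile
l_shape = rect(0, 3, 3, 1) | {(2, 2)}      # 3x1 strip plus the missing corner
tiles = [irregular, column, l_shape]

if __name__ == "__main__":
    print(is_exact_tiling(tiles, 4, 4))
    print(all(tiled_multiply(tiles, a, b) == a * b
              for a in range(16) for b in range(16)))
```

Because each partial product is counted exactly once, the shape of a tile does not affect correctness, only its implementation cost. That cost is what an optimizer such as the ILP mentioned above would trade off when choosing among tile shapes.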