Neural networks (NNs) have been widely adopted for their efficacy and adaptability across computer vision and deep learning applications. Deploying NNs on energy-constrained embedded devices requires optimization, because the limited energy budget makes efficient inference a significant challenge. This paper presents a runtime-reconfigurable multiplier architecture integrated into a RISC-V core, targeting energy-efficient neural network inference and edge AI applications. The proposed multiplier supports both exact and approximate computation, with multiple configurable accuracy levels selected via a dedicated mulscr, enabling fine-grained energy-accuracy control within a standard processor pipeline. The proposed design achieves 44%-52% and 62%-68% power reduction in exact and approximate modes, respectively, while maintaining a computational performance of 1.89 DMIPS/MHz. Evaluations on error-tolerant workloads, including 2D convolution and matrix multiplication, demonstrate up to a 63% reduction in energy consumption, with the proposed design achieving 1.21 pJ/instruction for matrix multiplication, confirming its effectiveness for energy-constrained edge AI deployments.
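The idea of a multiplier with selectable accuracy levels can be illustrated with a small software model. The abstract does not say which approximation scheme the hardware uses or how mulscr encodes its levels, so the sketch below makes its own assumptions: it approximates by zeroing the low-order bits of each operand (a common textbook technique, used here only as a stand-in), and the names `approx_mul`, `matmul`, `mean_relative_error`, and the `level` parameter are hypothetical.

```python
def approx_mul(a: int, b: int, level: int = 0) -> int:
    """Multiply two integers after clearing the `level` lowest bits of each.

    level == 0 leaves the operands untouched, i.e. the exact mode.
    Operand truncation is an assumed stand-in for the paper's scheme.
    """
    if level < 0:
        raise ValueError("level must be non-negative")
    mask = ~((1 << level) - 1)  # e.g. level=2 clears the two lowest bits
    return (a & mask) * (b & mask)


def matmul(A, B, level: int = 0):
    """Matrix product of nested lists, using approx_mul for every product."""
    rows, inner, cols = len(A), len(B), len(B[0])
    return [
        [sum(approx_mul(A[i][k], B[k][j], level) for k in range(inner))
         for j in range(cols)]
        for i in range(rows)
    ]


def mean_relative_error(exact, approx) -> float:
    """Average of |exact - approx| / |exact| over all non-zero exact entries."""
    errors = [
        abs(e - a) / abs(e)
        for row_e, row_a in zip(exact, approx)
        for e, a in zip(row_e, row_a)
        if e != 0
    ]
    return sum(errors) / len(errors) if errors else 0.0


if __name__ == "__main__":
    A = [[13, 7], [5, 250]]
    B = [[9, 3], [12, 100]]
    exact = matmul(A, B, level=0)
    for level in (0, 1, 2, 3):
        result = matmul(A, B, level=level)
        print(level, result, mean_relative_error(exact, result))
```

Running the script prints the result matrix and its mean relative error for each level, which shows how accuracy degrades as more bits are dropped. The model says nothing about power or energy; those figures depend on the hardware implementation reported in the paper.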