The softmax function is an activation function placed in the output layer of a neural network: it converts the network's outputs into class probabilities while introducing non-linearity into the model. On low-end FPGAs, implementations of Deep Neural Networks (DNNs) require exploring optimisation techniques to improve computational efficiency and hardware resource consumption. This work explores approximate computing techniques for implementing the softmax function, using Taylor and Padé approximations as well as interpolation methods based on Look-Up Tables (LUTs). These approximations aim to reduce execution time at the cost of some precision in the results produced by the softmax function. Each implementation is evaluated using the Root Mean Square Error (RMSE) to assess accuracy, and its performance is verified by measuring execution times. In our evaluation, quadratic interpolation with LUTs achieves the lowest error, whereas the Taylor and Padé approximations achieve better execution times, highlighting the design trade-off between numerical accuracy and power consumption.
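The hardware implementations themselves are not shown in the abstract; as a rough software illustration of the two families of approximations it mentions, the sketch below (plain Python, with illustrative parameters such as an 8-term Taylor series and a 17-entry LUT, neither taken from the paper) compares an exact softmax against a Taylor-series variant and a quadratic-LUT-interpolation variant using RMSE:

```python
import math

def exp_taylor(x, terms=8):
    """Truncated Taylor series of e^x around 0: sum of x^k / k! for k < terms."""
    acc, term = 1.0, 1.0
    for k in range(1, terms):
        term *= x / k          # builds x^k / k! incrementally
        acc += term
    return acc

def build_exp_lut(lo=-8.0, hi=0.0, n=16):
    """Tabulate e^x at n+1 equally spaced nodes on [lo, hi]."""
    step = (hi - lo) / n
    nodes = [lo + i * step for i in range(n + 1)]
    return nodes, [math.exp(t) for t in nodes]

def exp_lut_quadratic(x, nodes, values):
    """Quadratic (Lagrange) interpolation of e^x through three LUT entries."""
    step = nodes[1] - nodes[0]
    i = int((x - nodes[0]) / step)
    i = max(0, min(i, len(nodes) - 3))  # need nodes i, i+1, i+2
    x0, x1, x2 = nodes[i], nodes[i + 1], nodes[i + 2]
    y0, y1, y2 = values[i], values[i + 1], values[i + 2]
    return (y0 * (x - x1) * (x - x2) / ((x0 - x1) * (x0 - x2))
          + y1 * (x - x0) * (x - x2) / ((x1 - x0) * (x1 - x2))
          + y2 * (x - x0) * (x - x1) / ((x2 - x0) * (x2 - x1)))

def softmax(xs, exp_fn=math.exp):
    """Softmax with max subtraction for stability; exp_fn is pluggable."""
    m = max(xs)
    exps = [exp_fn(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def rmse(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

logits = [2.0, 1.0, 0.1]
exact = softmax(logits)
approx_taylor = softmax(logits, exp_fn=exp_taylor)
nodes, values = build_exp_lut()
approx_lut = softmax(logits,
                     exp_fn=lambda x: exp_lut_quadratic(x, nodes, values))
```

Subtracting the maximum logit bounds the exponent arguments to the interval (-inf, 0], which is what makes a one-sided LUT range or a low-order series expansion around 0 workable in the first place; the same trick is standard in fixed-point softmax hardware.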