TaLU: A Hybrid Activation Function Combining Tanh and Rectified Linear Unit to Enhance Neural Networks

Deep learning models play an important role in classification tasks, where accurate detection of target objects is required. Their accuracy, however, depends on the activation function used in the hidden and output layers. This paper uses an activation function called TaLU, a combination of Tanh and the Rectified Linear Unit (ReLU), to improve prediction. Many deep learning researchers use ReLU for its computational efficiency, ease of implementation, and intuitive behavior. However, it suffers from a dying gradient problem: when the input is negative, the output is always zero and so is the gradient, so the affected units receive no updates. Researchers have proposed several activation functions to address this issue, among the most notable being LeakyReLU, Softplus, Softsign, ELU, and ThresholdedReLU. This research developed TaLU, a modified activation function combining Tanh and ReLU, which mitigates the dying gradient problem of ReLU. A deep learning model with the proposed activation function was tested on MNIST and CIFAR-10, and it outperforms ReLU and some of the other studied activation functions in terms of accuracy (by up to 6% in most cases, when used with Batch Normalization and a reasonable learning rate).
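The dying gradient behavior described in the abstract can be illustrated with a small sketch. The abstract does not give the exact formula for TaLU, so `tanh_relu_hybrid` below is a hypothetical stand-in that assumes the simplest possible combination (identity for non-negative inputs, Tanh for negative inputs); it should not be read as the authors' definition. All function names are illustrative.

```python
import math


def relu(x: float) -> float:
    """Standard ReLU: passes positive inputs, outputs zero otherwise."""
    return x if x > 0.0 else 0.0


def relu_grad(x: float) -> float:
    """Derivative of ReLU: zero for every negative input (the dying gradient)."""
    return 1.0 if x > 0.0 else 0.0


def tanh_relu_hybrid(x: float) -> float:
    """Hypothetical Tanh/ReLU hybrid (an assumption, not the paper's formula):
    identity for x >= 0, tanh(x) for x < 0."""
    return x if x >= 0.0 else math.tanh(x)


def tanh_relu_hybrid_grad(x: float) -> float:
    """Derivative of the hypothetical hybrid: 1 - tanh(x)^2 on the negative side,
    which is strictly positive for any finite x."""
    return 1.0 if x >= 0.0 else 1.0 - math.tanh(x) ** 2


if __name__ == "__main__":
    # Compare gradients on both sides of zero.
    for x in (-2.0, -0.5, 0.5, 2.0):
        print(f"x={x:+.1f}  relu'={relu_grad(x):.4f}  hybrid'={tanh_relu_hybrid_grad(x):.4f}")
```

Under this assumption, negative inputs still produce a non-zero gradient, so a unit that receives negative pre-activations can continue to learn, whereas the ReLU gradient is exactly zero there.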

