Recent research has shown that the choice of activation function (AF), which introduces non-linearity into a network's output, can strongly affect how well deep learning networks perform. Activation functions that adapt during learning are therefore a pressing need. Researchers have recently begun developing activation functions that can be trained throughout the learning process, known as trainable or adaptive activation functions (AAFs). Research on AAFs that improve model outcomes is still in its early stages. In this paper, a novel activation function, 'ErfReLU', is developed based on the error function (erf) and ReLU, exploiting the advantages of both. State-of-the-art activation functions such as Sigmoid, ReLU, and Tanh and their properties are briefly explained. Adaptive activation functions, namely Tanhsoft1, Tanhsoft2, Tanhsoft3, TanhLU, SAAF, ErfAct, Pserf, Smish, and Serf, are also described. Lastly, a performance analysis of these 9 trainable activation functions, together with the proposed one, is presented by applying them in MobileNet, VGG16, and ResNet models on the CIFAR-10, MNIST, and FMNIST benchmark datasets.
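To illustrate what "trainable" means for an activation function, the following is a minimal sketch only. The abstract does not state the formula for ErfReLU, so the piecewise form used here (identity for non-negative inputs, a learnable scale `alpha` times erf for negative inputs) is an assumption made for illustration, as are the names `erf_relu_sketch`, `grad_alpha`, and `sgd_step`. The point is simply that `alpha` is updated by gradient descent like any other weight.

```python
import math


def erf_relu_sketch(x: float, alpha: float) -> float:
    """Assumed piecewise form (not taken from the abstract):
    x for x >= 0, alpha * erf(x) for x < 0."""
    return x if x >= 0.0 else alpha * math.erf(x)


def grad_alpha(x: float) -> float:
    """Derivative of the sketch with respect to the trainable parameter alpha."""
    return 0.0 if x >= 0.0 else math.erf(x)


def sgd_step(alpha: float, xs, targets, lr: float = 0.1) -> float:
    """One gradient-descent step on mean squared error, updating only alpha."""
    n = len(xs)
    grad = sum(
        2.0 * (erf_relu_sketch(x, alpha) - t) * grad_alpha(x)
        for x, t in zip(xs, targets)
    ) / n
    return alpha - lr * grad


if __name__ == "__main__":
    xs = [-2.0, -1.0, 0.5]
    # Hypothetical targets generated with alpha = 1.0
    targets = [erf_relu_sketch(x, 1.0) for x in xs]
    alpha = 0.2
    for _ in range(5):
        alpha = sgd_step(alpha, xs, targets)
        print(alpha)
```

In a deep learning framework the same idea would typically be expressed by registering `alpha` as a learnable parameter of the layer, so that the optimizer updates it alongside the network weights.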