The expressiveness of neural networks depends strongly on the nature of the activation functions, although these are usually assumed to be predefined and fixed during training. From a signal processing perspective, in this paper we present the Expressive Neural Network (ENN), a novel model in which the non-linear activation functions are modeled using the Discrete Cosine Transform (DCT) and adapted using backpropagation during training. This parametrization keeps the number of trainable parameters low, is appropriate for gradient-based schemes, and adapts to different learning tasks. This is the first non-linear model for activation functions that relies on a signal processing perspective, providing high flexibility and expressiveness to the network. We contribute insights into the explainability of the network at convergence by recovering the concept of a bump, that is, the response of each activation function in the output space. Finally, through extensive experiments we show that the model can adapt to classification and regression tasks. ENN outperforms state-of-the-art benchmarks, with an accuracy gap of more than 40% in some scenarios.
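To make the idea of a DCT-parametrized activation concrete, the following is a minimal sketch, not the formulation used in the paper. It assumes an activation written as a truncated cosine series, phi(x) = sum_k c_k cos(pi k (x + 1) / 2), with the input clipped to [-1, 1], and it adapts the coefficients c_k by plain gradient descent so that a single activation matches a fixed target shape. The function names (`cosine_basis`, `dct_activation`, `fit_activation`), the basis indexing, the number of coefficients, and the learning rate are all illustrative assumptions.

```python
import numpy as np

def cosine_basis(x, num_coeffs):
    """Return B with B[i, k] = cos(pi * k * (x_i + 1) / 2), x clipped to [-1, 1].

    Assumption: this cosine-series form stands in for the DCT model; the
    paper's exact indexing and normalization may differ.
    """
    x = np.clip(np.asarray(x, dtype=float), -1.0, 1.0)
    k = np.arange(num_coeffs)
    return np.cos(np.pi * k[None, :] * (x[:, None] + 1.0) / 2.0)

def dct_activation(x, coeffs):
    """Evaluate the activation as a linear combination of cosine basis functions."""
    return cosine_basis(x, len(coeffs)) @ coeffs

def fit_activation(x, target, num_coeffs=8, lr=0.1, steps=500):
    """Adapt the coefficients by gradient descent on the mean squared error."""
    coeffs = np.zeros(num_coeffs)
    basis = cosine_basis(x, num_coeffs)
    losses = []
    for _ in range(steps):
        residual = basis @ coeffs - target
        losses.append(float(np.mean(residual ** 2)))
        # The activation is linear in its coefficients, so the gradient of the
        # mean squared error with respect to coeffs is 2 * B^T residual / n.
        grad = 2.0 * basis.T @ residual / len(x)
        coeffs -= lr * grad
    return coeffs, losses

# Hypothetical usage: shape the activation to approximate |x| on [-1, 1].
x = np.linspace(-1.0, 1.0, 201)
coeffs, losses = fit_activation(x, np.abs(x))
```

Because the activation is linear in its coefficients, the gradient with respect to each c_k is simply the corresponding cosine evaluated at the input, which is one reason a parametrization of this kind suits gradient-based training. In a full network these coefficients would be updated by backpropagation together with the linear weights, rather than fitted to a fixed target as in this isolated sketch.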