Non-linear activation functions are crucial to Convolutional Neural Networks (CNNs); however, their behavior has so far not been well described in the frequency domain. In this work, we study the spectral behavior of ReLU, a popular activation function. Using ReLU's Taylor expansion, we derive its frequency-domain behavior and demonstrate that ReLU introduces higher-frequency oscillations into the signal along with a constant DC component. Furthermore, we investigate the importance of this DC component and show that it helps the model extract meaningful features related to the frequency content of the input. We accompany our theoretical derivations with experiments and real-world examples. First, we numerically validate our frequency-response model. Then we observe ReLU's spectral behavior in two example models and one real-world model. Finally, we experimentally investigate the role of the DC component introduced by ReLU in the CNN's representations. Our results indicate that the DC component helps the model converge to a weight configuration close to the initial random weights.
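As a minimal numerical illustration of the central claim (our sketch, not the paper's own experiment), the snippet below feeds a unit-amplitude sinusoid through ReLU and inspects its magnitude spectrum; the sampling rate fs, duration, and tone frequency f0 are arbitrary choices. Since ReLU applied to a sine is a half-wave rectified sine, classical Fourier analysis predicts a DC term of 1/π, a fundamental of amplitude 1/2, and non-zero even harmonics, which is exactly the DC offset and extra high-frequency content described above.

```python
import numpy as np

# Minimal sketch: feed a unit-amplitude sinusoid through ReLU and
# inspect the magnitude spectrum. ReLU(sin) is a half-wave rectified
# sine, so we expect a DC term of 1/pi and energy at even harmonics.
fs = 1000                        # sampling rate in Hz (arbitrary choice)
f0 = 5                           # input tone frequency in Hz (arbitrary)
t = np.arange(0, 1, 1 / fs)      # one second of samples
x = np.sin(2 * np.pi * f0 * t)

relu_x = np.maximum(x, 0.0)      # ReLU(x) = max(0, x)

# One-sided spectrum, normalized so bins read directly as amplitudes.
spectrum = np.abs(np.fft.rfft(relu_x)) / len(relu_x)
freqs = np.fft.rfftfreq(len(relu_x), 1 / fs)

print(f"DC component: {spectrum[0]:.4f}")                      # ~1/pi = 0.3183
print(f"Amplitude at f0: {2 * spectrum[freqs == f0][0]:.4f}")  # ~0.5
for k in (2, 3, 4, 6):           # even harmonics are non-zero, odd vanish
    amp = 2 * spectrum[freqs == k * f0][0]
    print(f"Amplitude at {k}*f0: {amp:.4f}")
```

Running this prints a DC component of roughly 0.318 and non-zero amplitudes at 2·f0, 4·f0, and 6·f0 (about 0.212, 0.042, and 0.018), while the odd harmonic at 3·f0 vanishes, consistent with the half-wave rectification view of ReLU.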