Padé Neurons for Efficient Neural Models

Neural networks commonly employ the McCulloch-Pitts neuron model, which is a linear model followed by a point-wise non-linear activation. Researchers have already proposed inherently non-linear neuron models, such as quadratic neurons, generalized operational neurons, generative neurons, and super neurons, which offer stronger non-linearity than point-wise activation functions. In this paper, we introduce a novel and improved non-linear neuron model called Padé neurons (Paons), inspired by Padé approximants. Paons offer several advantages, such as diversity of non-linearity, since each Paon learns a different non-linear function of its inputs, and layer efficiency, since Paons provide stronger non-linearity in far fewer layers than piecewise linear approximation. Furthermore, Paons include all previously proposed neuron models as special cases; thus, any neuron model in any network can be replaced by a Paon. We note that an earlier proposal employs the Padé approximation as a generalized point-wise activation function, which is fundamentally different from our model. To validate the efficacy of Paons, we replace the classic neurons in several well-known ResNet-based neural image super-resolution, compression, and classification models with Paons. Our comprehensive experimental results and analyses demonstrate that neural models built with Paons match or outperform their classic counterparts while using fewer layers. The PyTorch implementation of Paon is open-sourced at https://github.com/onur-keles/Paon.
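To make the idea of a neuron built on a Padé approximant (a ratio of two polynomials) more concrete, the following is a minimal plain-Python sketch. It is not the authors' implementation, and the abstract does not specify the exact formulation. The function name `pade_neuron`, the weight layout, and the denominator form `1 + |Q(x)|` (chosen here only to avoid division by zero) are all assumptions made for illustration.

```python
from typing import Sequence


def pade_neuron(
    x: Sequence[float],
    num_w: Sequence[Sequence[float]],
    den_w: Sequence[Sequence[float]],
    bias: float = 0.0,
) -> float:
    """Hypothetical scalar Pade-style neuron (illustrative assumption).

    num_w[k][i] weights x[i] ** (k + 1) in the numerator polynomial.
    den_w[k][i] weights x[i] ** (k + 1) in the denominator polynomial.
    The denominator is 1 + |Q(x)|, so it never reaches zero.
    """
    # Numerator: bias plus a weighted sum of powers of each input.
    numerator = bias
    for k, row in enumerate(num_w):
        numerator += sum(w * xi ** (k + 1) for w, xi in zip(row, x))

    # Denominator polynomial, made pole-free with the absolute value.
    den_poly = 0.0
    for k, row in enumerate(den_w):
        den_poly += sum(w * xi ** (k + 1) for w, xi in zip(row, x))

    return numerator / (1.0 + abs(den_poly))


x = [1.0, 2.0]

# Degree-1 numerator and no denominator terms: the neuron reduces to an
# ordinary linear (pre-activation) neuron, w . x + b.
linear_out = pade_neuron(x, num_w=[[0.5, -1.0]], den_w=[], bias=0.25)

# Degree-2 numerator over a degree-1 denominator: a rational response.
rational_out = pade_neuron(
    x,
    num_w=[[1.0, 0.0], [0.0, 1.0]],
    den_w=[[0.0, 0.5]],
)
```

With an empty denominator and a degree-1 numerator, this sketch collapses to a plain weighted sum, which illustrates in a simplified way how a rational neuron can contain a simpler neuron model as a special case. With the denominator still empty and a higher-degree numerator, it becomes a purely polynomial neuron.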
