Higher-order artificial neurons, whose outputs are computed by applying an activation function to a higher-order multinomial function of the inputs, have been considered in the past but did not gain acceptance because of their extra parameters and computational cost. However, higher-order neurons have significantly greater learning capabilities, since their decision boundaries can be complex surfaces instead of just hyperplanes. The boundary of a single quadratic neuron can be a general hyper-quadric surface, allowing it to learn many nonlinearly separable datasets. Since quadratic forms can be represented by symmetric matrices, only $\frac{n(n+1)}{2}$ additional parameters are needed instead of $n^2$. A quadratic logistic regression model is first presented. Solutions to the XOR problem with a single quadratic neuron are considered. The complete vectorized equations for both forward and backward propagation in feedforward networks composed of quadratic neurons are derived. A reduced-parameter quadratic neural network model with just $n$ additional parameters per neuron, which provides a compromise between learning ability and computational cost, is presented. Comparisons on benchmark classification datasets are used to demonstrate that a final layer of quadratic neurons enables networks to achieve higher accuracy with significantly fewer hidden-layer neurons. In particular, this paper shows that any dataset composed of $C$ bounded clusters can be separated with only a single layer of $C$ quadratic neurons.
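
The XOR claim can be illustrated with a minimal sketch. It assumes a quadratic neuron of the form $\sigma(x^T A x + w^T x + b)$ with a symmetric matrix $A$; this parameterization, the function name `quadratic_neuron`, and the hand-picked (not learned) weights below are assumptions made for illustration and are not taken from the paper's own derivation.

```python
import numpy as np

def quadratic_neuron(X, A, w, b):
    """Sigmoid of x^T A x + w^T x + b, evaluated for each row x of X."""
    quad = np.einsum("bi,ij,bj->b", X, A, X)  # batched quadratic form
    z = quad + X @ w + b
    return 1.0 / (1.0 + np.exp(-z))

# The four XOR inputs, in the order (0,0), (0,1), (1,0), (1,1).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)

# Hand-picked weights (an assumption, not a trained solution):
# z = x1 + x2 - 2*x1*x2 - 0.5, which is positive only when x1 != x2.
A = np.array([[0.0, -1.0], [-1.0, 0.0]])  # symmetric, so x^T A x = -2*x1*x2
w = np.array([1.0, 1.0])
b = -0.5

probs = quadratic_neuron(X, A, w, b)
preds = (probs > 0.5).astype(int)

# A symmetric n x n matrix has n(n+1)/2 free entries.
n = X.shape[1]
n_quadratic_params = n * (n + 1) // 2

print(preds.tolist(), n_quadratic_params)
```

With these weights the decision boundary $x_1 + x_2 - 2 x_1 x_2 = 0.5$ is a hyperbola rather than a line, which is why one neuron suffices here; for $n = 2$ the symmetric matrix contributes three free parameters rather than four.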