Higher order artificial neurons, whose outputs are computed by applying an activation function to a higher order multinomial function of the inputs, have been considered in the past but did not gain acceptance because of their extra parameters and computational cost. However, higher order neurons have significantly greater learning capabilities, since their decision boundaries can be complex surfaces instead of just hyperplanes. The boundary of a single quadratic neuron can be a general hyper-quadric surface, allowing it to learn many nonlinearly separable datasets. Since quadratic forms can be represented by symmetric matrices, a neuron with $n$ inputs needs only $\frac{n(n+1)}{2}$ additional parameters instead of $n^2$. A quadratic logistic regression model is presented first. Solutions to the XOR problem with a single quadratic neuron are then considered. The complete vectorized equations for both forward and backward propagation in feedforward networks composed of quadratic neurons are derived. A reduced parameter quadratic neural network model with just $n$ additional parameters per neuron, which provides a compromise between learning ability and computational cost, is presented. Comparisons on benchmark classification datasets are used to demonstrate that a final layer of quadratic neurons enables networks to achieve higher accuracy with significantly fewer hidden layer neurons. In particular, this paper shows that any dataset composed of $\mathcal{C}$ bounded clusters can be separated with only a single layer of $\mathcal{C}$ quadratic neurons.
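As a minimal illustration of the XOR claim, the sketch below evaluates a single quadratic neuron of the assumed form $\sigma(x^{T} W x + w^{T} x + b)$ with a symmetric matrix $W$. The function names and the weight values are hypothetical: the weights are hand-picked for this example rather than learned, and they are not taken from the paper.

```python
import math


def quadratic_neuron(x, W, w, b):
    """Return sigmoid(x^T W x + w^T x + b) for one input vector x.

    W is assumed symmetric; this form is an illustrative assumption,
    not necessarily the exact parameterization used in the paper.
    """
    n = len(x)
    quad = sum(x[i] * W[i][j] * x[j] for i in range(n) for j in range(n))
    lin = sum(w[i] * x[i] for i in range(n))
    z = quad + lin + b
    return 1.0 / (1.0 + math.exp(-z))


def num_quadratic_params(n):
    """Number of free entries in a symmetric n x n matrix: n(n+1)/2."""
    return n * (n + 1) // 2


# Hand-picked (not learned) weights giving z = x1 + x2 - 2*x1*x2 - 0.5.
# z is -0.5 at (0,0) and (1,1), and +0.5 at (0,1) and (1,0).
W = [[0.0, -1.0],
     [-1.0, 0.0]]
w = [1.0, 1.0]
b = -0.5

XOR_INPUTS = [(0, 0), (0, 1), (1, 0), (1, 1)]

# Threshold the sigmoid output at 0.5 to obtain a class label.
predictions = [int(quadratic_neuron(x, W, w, b) > 0.5) for x in XOR_INPUTS]

if __name__ == "__main__":
    for x, p in zip(XOR_INPUTS, predictions):
        print(x, "->", p)
    print("symmetric parameters for n=2:", num_quadratic_params(2))
```

The decision boundary here is the quadric $x_1 + x_2 - 2x_1x_2 = 0.5$, which a single linear neuron cannot realize. For $n = 2$ the symmetric matrix contributes 3 free parameters rather than 4, matching the $\frac{n(n+1)}{2}$ count.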