Higher-order artificial neurons, whose outputs are computed by applying an activation function to a higher-order multinomial function of the inputs, have been considered in the past but did not gain acceptance due to the extra parameters and computational cost. However, higher-order neurons have significantly greater learning capabilities, since their decision boundaries can be complex surfaces instead of just hyperplanes. The boundary of a single quadratic neuron can be a general hyper-quadric surface, allowing it to learn many nonlinearly separable datasets. Since quadratic forms can be represented by symmetric matrices, only $\frac{n(n+1)}{2}$ additional parameters are needed instead of $n^2$. A quadratic logistic regression model is first presented. Solutions to the XOR problem with a single quadratic neuron are considered. The complete vectorized equations for both forward and backward propagation in feedforward networks composed of quadratic neurons are derived. A reduced-parameter quadratic neural network model with just $n$ additional parameters per neuron, which provides a compromise between learning ability and computational cost, is presented. Comparisons on benchmark classification datasets are used to demonstrate that a final layer of quadratic neurons enables networks to achieve higher accuracy with significantly fewer hidden-layer neurons. In particular, this paper shows that any dataset composed of $\mathcal{C}$ bounded clusters can be separated with only a single layer of $\mathcal{C}$ quadratic neurons.
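As a minimal sketch of the XOR claim, the snippet below evaluates a single quadratic neuron of the form $\sigma(x^\top W x + w^\top x + b)$ on the four XOR inputs. The function name `quadratic_neuron` and the weight values are illustrative assumptions chosen by hand, not the solutions given in the paper; they only show that one such neuron with a symmetric $W$ can separate XOR.

```python
import numpy as np

def quadratic_neuron(X, W, w, b):
    """Sigmoid of x^T W x + w^T x + b, evaluated for each row x of X."""
    # Quadratic term x^T W x per sample, plus the usual linear term and bias.
    z = np.einsum("ni,ij,nj->n", X, W, X) + X @ w + b
    return 1.0 / (1.0 + np.exp(-z))

# The four XOR inputs and their labels.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0])

# Hand-picked (assumed) parameters: z = x1 + x2 - 2*x1*x2 - 0.5.
W = np.array([[0.0, -1.0],
              [-1.0, 0.0]])  # symmetric, so x^T W x = -2*x1*x2
w = np.array([1.0, 1.0])
b = -0.5

pred = (quadratic_neuron(X, W, w, b) > 0.5).astype(int)

# A symmetric n x n matrix has n(n+1)/2 free entries.
n = X.shape[1]
n_quadratic_params = n * (n + 1) // 2
```

With these values the pre-activation is $-0.5$ at $(0,0)$ and $(1,1)$ and $+0.5$ at $(0,1)$ and $(1,0)$, so thresholding the sigmoid at $0.5$ reproduces the XOR labels. For $n = 2$ the symmetric matrix contributes $3$ free parameters rather than $4$.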