On Expressivity and Trainability of Quadratic Networks

Inspired by the diversity of biological neurons, quadratic artificial neurons can play an important role in deep learning models. The type of quadratic neuron we study replaces the inner-product operation of the conventional neuron with a quadratic function. Despite the promising results achieved so far by networks of quadratic neurons, important issues remain unaddressed. Theoretically, the superior expressivity of a quadratic network over either a conventional network or a conventional network with quadratic activation has not been fully elucidated, which leaves the use of quadratic networks poorly grounded. Practically, although a quadratic network can be trained via generic backpropagation, it is subject to a higher risk of collapse than its conventional counterpart. To address these issues, we first apply spline theory and a measure from algebraic geometry to give two theorems that demonstrate the better model expressivity of a quadratic network compared with the conventional counterpart, with or without quadratic activation. Then, we propose an effective training strategy, referred to as ReLinear, to stabilize the training process of a quadratic network, thereby unleashing its full potential in the associated machine learning tasks. Comprehensive experiments on popular datasets support our findings and confirm the performance of quadratic deep learning. Our code is available at \url{https://github.com/FengleiFan/ReLinear}.
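To make the contrast between the two neuron types concrete, here is a minimal sketch in plain Python. The abstract does not spell out the quadratic function, so the parameterization below (two inner products multiplied together, plus a weighted sum of squared inputs) is an assumption, as are all names (`w_r`, `w_g`, `w_b`, `linear_like_init`). The initialization shown, which makes the quadratic neuron start out identical to a linear one, is likewise only one plausible way training might be stabilized; it should not be read as the definition of ReLinear.

```python
import math


def linear_neuron(x, w, b):
    """Conventional pre-activation: inner product of w and x, plus bias."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def quadratic_neuron(x, w_r, b_r, w_g, b_g, w_b, c):
    """Assumed quadratic pre-activation (not taken from the abstract):
    (w_r.x + b_r) * (w_g.x + b_g) + w_b.(x*x) + c
    """
    first = linear_neuron(x, w_r, b_r)
    second = linear_neuron(x, w_g, b_g)
    power = sum(wi * xi * xi for wi, xi in zip(w_b, x))
    return first * second + power + c


def linear_like_init(w, b):
    """Hypothetical initialization: with w_g = 0, b_g = 1, w_b = 0, c = 0
    the quadratic neuron reduces exactly to the linear neuron (w, b)."""
    n = len(w)
    return dict(w_r=list(w), b_r=b,
                w_g=[0.0] * n, b_g=1.0,
                w_b=[0.0] * n, c=0.0)


x = [0.5, -2.0, 1.5]
w = [1.0, 0.25, -0.5]
b = 0.1

params = linear_like_init(w, b)
lin = linear_neuron(x, w, b)
quad = quadratic_neuron(x, **params)
print(math.isclose(lin, quad))  # the two neurons agree at initialization
```

Under this assumed form, the quadratic terms only start to contribute once `w_g` and `w_b` move away from zero during training, so a network initialized this way begins from the same function as its conventional counterpart.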

