Despite the remarkable capabilities of deep neural networks in image recognition, the dependence on activation functions remains a largely unexplored area and has yet to be eliminated. On the other hand, Polynomial Networks is a class of models that does not require activation functions, but have yet to perform on par with modern architectures. In this work, we aim close this gap and propose MONet, which relies solely on multilinear operators. The core layer of MONet, called Mu-Layer, captures multiplicative interactions of the elements of the input token. MONet captures high-degree interactions of the input elements and we demonstrate the efficacy of our approach on a series of image recognition and scientific computing benchmarks. The proposed model outperforms prior polynomial networks and performs on par with modern architectures. We believe that MONet can inspire further research on models that use entirely multilinear operations.
翻译:尽管深度神经网络在图像识别中展现出卓越能力,但对激活函数的依赖仍是一个尚未充分探索且尚未消除的领域。另一方面,多项式网络作为一类无需激活函数的模型,至今仍难以达到与现代架构相当的性能。本研究旨在弥合这一差距,提出仅依赖多线性算子的MONet模型。其核心层——Mu-Layer——通过捕捉输入标记元素间的乘法交互作用,实现了对输入元素高阶交互关系的建模。我们通过一系列图像识别与科学计算基准测试验证了该方法的有效性。所提模型不仅优于以往多项式网络,更与现代主流架构性能相当。我们相信MONet能够激发对基于纯多线性运算模型的进一步研究。