Activation Functions Not To Active: A Plausible Theory on Interpreting Neural Networks

Researchers commonly believe that neural networks model a high-dimensional space, but they cannot give a clear definition of this space. What is this space? What is its dimension? Does it have finitely many dimensions? In this paper, we develop a plausible theory for interpreting neural networks in terms of the role of activation functions, and we define a high-dimensional (more precisely, infinite-dimensional) space that neural networks, including deep-learning networks, could create. We show that the activation function acts as a magnifying function that maps the low-dimensional linear space into an infinite-dimensional space, which can distinctly identify the polynomial approximation of any multivariate continuous function whose variables take the values of the features of the given dataset. Given a dataset in which each example has $d$ features $f_1$, $f_2$, $\cdots$, $f_d$, we believe that neural networks model a special space with infinitely many dimensions, each of which is a monomial $$f_1^{i_1} f_2^{i_2} \cdots f_d^{i_d}$$ for some non-negative integers $i_1, i_2, \cdots, i_d \in \mathbb{Z}_{0}^{+}=\{0,1,2,3,\ldots\}$. We term such an infinite-dimensional space a $\textit{Super Space (SS)}$, and we regard each such dimension as the minimum information unit. Every neuron node that has passed through an activation layer is a $\textit{Super Plane (SP)}$, which is in fact a polynomial of infinite degree. The $\textit{Super Space}$ is something like a coordinate system in which every multivariate function can be represented by a $\textit{Super Plane}$. We also show that training NNs could at least be reduced to solving a system of nonlinear equations.
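
The claim that an activation turns a linear combination of features into a polynomial over monomials can be illustrated numerically. The sketch below is not the authors' construction: it assumes an activation with a convergent Taylor series (the exponential is used purely for convenience), truncates that series at a finite degree, and expands $\sum_k a_k (w \cdot f + b)^k$ with the multinomial theorem. The function names `monomial_coefficients` and `evaluate` are hypothetical.

```python
from itertools import product
from math import exp, factorial, prod


def monomial_coefficients(weights, bias, taylor_coeffs):
    """Expand sum_k a_k * (w . f + b)**k into monomial coefficients.

    Returns a dict mapping an exponent tuple (i_1, ..., i_d) to the
    coefficient of f_1**i_1 * ... * f_d**i_d. The series is truncated
    at degree len(taylor_coeffs) - 1 (an assumption of this sketch).
    """
    d = len(weights)
    coeffs = {}
    for k, a_k in enumerate(taylor_coeffs):
        # Multinomial theorem over the d weighted features plus the bias.
        for exps in product(range(k + 1), repeat=d):
            j = k - sum(exps)  # power carried by the bias term
            if j < 0:
                continue
            denom = factorial(j) * prod(factorial(e) for e in exps)
            multinom = factorial(k) // denom
            term = a_k * multinom * bias ** j
            term *= prod(w ** e for w, e in zip(weights, exps))
            coeffs[exps] = coeffs.get(exps, 0.0) + term
    return coeffs


def evaluate(coeffs, features):
    """Evaluate the polynomial given by coeffs at a feature vector."""
    return sum(
        c * prod(f ** e for f, e in zip(features, exps))
        for exps, c in coeffs.items()
    )


# Illustrative values only: one neuron with two input features.
weights, bias = [0.5, -0.3], 0.1
taylor_exp = [1.0 / factorial(k) for k in range(13)]  # exp(z) up to z**12
coeffs = monomial_coefficients(weights, bias, taylor_exp)
approx = evaluate(coeffs, [0.7, 0.2])
exact = exp(0.5 * 0.7 - 0.3 * 0.2 + 0.1)
```

Each key of `coeffs` corresponds to one monomial, that is, one dimension of the space described above; raising the truncation degree adds more such dimensions, and the full series would need infinitely many. Activations without a Taylor expansion at the point of interest are not covered by this sketch.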

