In this paper, we investigate the computational power of threshold circuits and other theoretical models of neural networks in terms of the following four complexity measures: size (the number of gates), depth, weight and energy. Here, the energy complexity of a circuit measures the sparsity of its computation, and is defined as the maximum number of gates outputting non-zero values, taken over all input assignments. As our main result, we prove that any threshold circuit $C$ of size $s$, depth $d$, energy $e$ and weight $w$ satisfies $\log (rk(M_C)) \le ed (\log s + \log w + \log n)$, where $rk(M_C)$ is the rank of the communication matrix $M_C$ of the $2n$-variable Boolean function that $C$ computes. Thus, such a threshold circuit $C$ can compute only Boolean functions for which the logarithm of the rank of the communication matrix is bounded by a product of logarithmic factors of $s,w$ and linear factors of $d,e$. This implies an exponential lower bound on the size of even sublinear-depth threshold circuits if energy and weight are sufficiently small. For other models of neural networks, such as discretized ReLU circuits and discretized sigmoid circuits, we prove that a similar inequality also holds for a discretized circuit $C$: $rk(M_C) = O(ed(\log s + \log w + \log n)^3)$.
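To illustrate how the main inequality turns a rank computation into a size lower bound, the following is a minimal sketch, not the paper's method. It builds the communication matrix of the equality function on $n + n$ bits (chosen here only as a familiar full-rank example), computes its rank exactly, and rearranges $\log (rk(M_C)) \le ed (\log s + \log w + \log n)$ to bound $\log s$ from below. The helper names (`communication_matrix`, `rank`, `min_log2_size`) are hypothetical, and the logarithm base is assumed to be 2, which the abstract does not specify.

```python
from fractions import Fraction
from itertools import product
from math import log2


def communication_matrix(f, n):
    """Rows are indexed by x in {0,1}^n, columns by y in {0,1}^n; entry is f(x, y)."""
    inputs = list(product((0, 1), repeat=n))
    return [[f(x, y) for y in inputs] for x in inputs]


def rank(matrix):
    """Exact rank over the rationals via Gaussian elimination."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    n_cols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for col in range(n_cols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            if rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def equality(x, y):
    """Illustrative 2n-variable function: 1 iff the two n-bit halves agree."""
    return int(x == y)


def min_log2_size(log2_rank, n, depth, energy, weight):
    """Rearranged bound: log2(s) >= log2(rk) / (e * d) - log2(w) - log2(n).

    Assumes base-2 logarithms; the abstract leaves the base unspecified.
    """
    return log2_rank / (energy * depth) - log2(weight) - log2(n)


n = 4
rk = rank(communication_matrix(equality, n))  # identity matrix of order 2**n
print(rk)
print(min_log2_size(log2(rk), n, depth=1, energy=1, weight=1))
```

For the equality function the matrix is the identity of order $2^n$, so $\log (rk(M_C)) = n$ and the rearranged bound reads $\log s \ge n/(ed) - \log w - \log n$, which grows linearly in $n$ (so $s$ grows exponentially) whenever $ed$ is sublinear in $n$ and $w$ is small.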