We show that feedforward neural networks with ReLU activation generalize on low-complexity data, suitably defined. Given i.i.d.~data generated from a simple programming language, the minimum description length (MDL) feedforward neural network that interpolates the data generalizes with high probability. We define this simple programming language, along with a notion of description length of such networks. We provide several examples on basic computational tasks, such as checking primality of a natural number. For primality testing, our theorem shows the following and more. Suppose that we draw an i.i.d.~sample of $n$ numbers uniformly at random from $1$ to $N$. For each number $x_i$, let $y_i = 1$ if $x_i$ is a prime and $0$ if it is not. Then, with probability $1 - O((\ln N)/n)$, the interpolating MDL network correctly answers whether a newly drawn number between $1$ and $N$ is a prime or not. Note that the network is not designed to detect primes; minimum description length learning discovers a network which does so. Extensions to noisy data are also discussed, suggesting that MDL neural network interpolators can demonstrate tempered overfitting.
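The sampling setup in the primality example can be sketched concretely. The snippet below is a minimal illustration, not part of the paper's construction: the function names (`is_prime`, `draw_sample`) and the choice of trial division are our own, and the final line only evaluates the scale of the stated $O((\ln N)/n)$ error bound, not its hidden constant.

```python
import math
import random

def is_prime(x: int) -> bool:
    """Trial-division primality test; 1 is not prime."""
    if x < 2:
        return False
    d = 2
    while d * d <= x:
        if x % d == 0:
            return False
        d += 1
    return True

def draw_sample(N: int, n: int, seed: int = 0):
    """Draw n i.i.d. numbers uniformly from {1, ..., N} with labels y_i = 1[x_i is prime]."""
    rng = random.Random(seed)
    xs = [rng.randint(1, N) for _ in range(n)]
    ys = [int(is_prime(x)) for x in xs]
    return xs, ys

N, n = 10**6, 10**4
xs, ys = draw_sample(N, n)
# Scale of the theorem's failure probability, up to constants:
bound_scale = math.log(N) / n
```

An interpolating MDL network trained on `(xs, ys)` would, by the theorem, misclassify a fresh uniform draw with probability on the order of `bound_scale`.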