In practice, deep neural networks are often able to easily interpolate their training data. To understand this phenomenon, many works have aimed to quantify the memorization capacity of a neural network architecture: the largest number of points such that the architecture can interpolate any placement of these points with any assignment of labels. For real-world data, however, one intuitively expects the presence of a benign structure so that interpolation already occurs at a smaller network size than suggested by memorization capacity. In this paper, we investigate interpolation by adopting an instance-specific viewpoint. We introduce a simple randomized algorithm that, given a fixed finite data set with two classes, with high probability constructs an interpolating three-layer neural network in polynomial time. The required number of parameters is linked to geometric properties of the two classes and their mutual arrangement. As a result, we obtain guarantees that are independent of the number of samples and hence move beyond worst-case memorization capacity bounds. We verify our theoretical result with numerical experiments and additionally investigate the effectiveness of the algorithm on MNIST and CIFAR-10.
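The abstract does not spell out the algorithm's internals, so the following is only a hedged illustration of the kind of construction it describes: a randomized, polynomial-time procedure that draws random hidden layers for a three-layer ReLU network, fits the output layer by least squares, and retries with fresh randomness until the two-class dataset is interpolated. The function `random_interpolator` and its parameters (`width`, `max_tries`) are hypothetical names introduced here; this is not the paper's method.

```python
import numpy as np

def random_interpolator(X, y, width=256, max_tries=10, seed=0):
    """Randomized search for a three-layer ReLU network interpolating
    a two-class dataset with labels in {-1, +1}.

    Illustrative sketch only: two random hidden layers, output layer
    fit by least squares, with retries on failure.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    for _ in range(max_tries):
        # First hidden layer: random Gaussian weights and biases.
        W1 = rng.standard_normal((d, width)) / np.sqrt(d)
        b1 = rng.standard_normal(width)
        H1 = np.maximum(X @ W1 + b1, 0.0)  # ReLU activations
        # Second hidden layer: again random.
        W2 = rng.standard_normal((width, width)) / np.sqrt(width)
        b2 = rng.standard_normal(width)
        H2 = np.maximum(H1 @ W2 + b2, 0.0)
        # Output layer: least-squares fit of the labels on the features.
        v, *_ = np.linalg.lstsq(H2, y, rcond=None)
        # Success means the network's sign matches every label exactly.
        if np.all(np.sign(H2 @ v) == y):
            return W1, b1, W2, b2, v
    raise RuntimeError("no interpolating network found; try a larger width")

# Usage on synthetic two-class data: with width exceeding the sample
# count, the random features are generically full rank and the least-
# squares fit interpolates with high probability.
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 10))
y = np.sign(rng.standard_normal(200))
params = random_interpolator(X, y, width=256)
```

In this toy version the success probability is governed by the rank of the random feature matrix; the paper's actual guarantee is finer, tying the required number of parameters to the geometry of the two classes rather than to the sample count.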