Memorization with neural nets: going beyond the worst case

In practice, deep neural networks are often able to easily interpolate their training data. To understand this phenomenon, many works have aimed to quantify the memorization capacity of a neural network architecture: the largest number of points such that the architecture can interpolate any placement of these points with any assignment of labels. For real-world data, however, one intuitively expects the presence of a benign structure so that interpolation already occurs at a smaller network size than suggested by memorization capacity. In this paper, we investigate interpolation by adopting an instance-specific viewpoint. We introduce a simple randomized algorithm that, given a fixed finite dataset with two classes, with high probability constructs an interpolating three-layer neural network in polynomial time. The required number of parameters is linked to geometric properties of the two classes and their mutual arrangement. As a result, we obtain guarantees that are independent of the number of samples and hence move beyond worst-case memorization capacity bounds. We illustrate the effectiveness of the algorithm in non-pathological situations with extensive numerical experiments and link the insights back to the theoretical results.

