Although deep learning models have taken on commercial and political relevance, key aspects of their training and operation remain poorly understood. This has sparked interest in science-of-deep-learning projects, many of which require large amounts of time, money, and electricity. But how much of this research really needs to occur at scale? In this paper, we introduce MNIST-1D: a minimalist, procedurally generated, low-memory, and low-compute alternative to classic deep learning benchmarks. Although MNIST-1D has a dimensionality of only 40 and a default training set of only 4000 examples, it can be used to study inductive biases of different deep architectures, find lottery tickets, observe deep double descent, metalearn an activation function, and demonstrate guillotine regularization in self-supervised learning. All these experiments can be conducted on a GPU, and often even on a CPU, within minutes, allowing for fast prototyping, educational use cases, and cutting-edge research on a low budget.
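To give a rough sense of the "low-memory" claim, the following back-of-the-envelope sketch compares the raw size of the MNIST-1D training inputs (4000 examples of dimension 40, as stated above) with the standard MNIST training split (60000 images of 28 x 28 pixels). It assumes dense float32 storage for both datasets, which is an illustrative choice and not something the abstract specifies. The helper name `dataset_bytes` is hypothetical.

```python
# Back-of-the-envelope memory footprint of a dense training set.
BYTES_PER_FLOAT32 = 4  # assumption: inputs stored as 32-bit floats


def dataset_bytes(num_examples: int, num_features: int) -> int:
    """Bytes needed to hold all inputs as a dense float32 array."""
    return num_examples * num_features * BYTES_PER_FLOAT32


# Figures from the abstract: 4000 training examples, 40 dimensions.
mnist1d_bytes = dataset_bytes(4000, 40)

# Standard MNIST training split for comparison: 60000 images of 28 x 28 pixels.
mnist_bytes = dataset_bytes(60000, 28 * 28)

print(f"MNIST-1D: {mnist1d_bytes / 1e6:.2f} MB")
print(f"MNIST:    {mnist_bytes / 1e6:.2f} MB")
print(f"Ratio:    {mnist_bytes / mnist1d_bytes:.0f}x")
```

Under these assumptions the MNIST-1D inputs occupy well under a megabyte, roughly 294 times less than the MNIST inputs. This counts input storage only and says nothing about model size or training time.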