Generalization in Neural Networks: A Broad Survey

This paper reviews concepts, modeling approaches, and recent findings on generalization in neural network models across a spectrum of levels of abstraction: (1) Samples, (2) Distributions, (3) Domains, (4) Tasks, (5) Modalities, and (6) Scopes. Results on (1) sample generalization show that, in the case of ImageNet, nearly all the recent improvements reduced training error while overfitting stayed flat; with nearly all the training error eliminated, future progress will require a focus on reducing overfitting. Perspectives from statistics highlight how (2) distribution generalization can be viewed alternately as a change in sample weights or a change in the input-output relationship; thus, techniques that have been successful in domain generalization have the potential to be applied to difficult forms of sample or distribution generalization. Transfer learning approaches to (3) domain generalization are summarized, as are recent advances and the wealth of domain adaptation benchmark datasets available. Recent breakthroughs surveyed in (4) task generalization include few-shot meta-learning approaches and the BERT NLP engine. Recent (5) modality generalization studies are discussed that integrate image and text data and that apply a biologically inspired network across olfactory, visual, and auditory modalities. Recent (6) scope generalization results are reviewed that embed knowledge graphs into deep NLP approaches. Additionally, concepts from neuroscience are discussed on the modular architecture of brains and the steps by which dopamine-driven conditioning leads to abstract thinking.
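
The "change in sample weights" view of distribution generalization can be illustrated with a small importance-weighting sketch. It is an illustration only, not the method of any work surveyed here. It assumes the input-output relationship is unchanged and that only the input distribution shifts. The function names (`importance_weights`, `reweighted_risk`) and the toy probabilities are hypothetical.

```python
def importance_weights(source_probs, target_probs):
    """Weight for each input value x: p_target(x) / p_source(x).

    Assumes every x has nonzero probability under the source distribution.
    """
    return {x: target_probs[x] / source_probs[x] for x in source_probs}


def reweighted_risk(samples, losses, weights):
    """Estimate the target-distribution risk from source-distribution samples."""
    total = sum(weights[x] * loss for x, loss in zip(samples, losses))
    return total / len(samples)


# Toy setup (assumed values): inputs take two values, and a fixed model
# is always right on "a" (loss 0) and always wrong on "b" (loss 1).
source_probs = {"a": 0.75, "b": 0.25}
target_probs = {"a": 0.25, "b": 0.75}

# Four samples drawn in proportion to the source distribution.
samples = ["a", "a", "a", "b"]
losses = [0.0, 0.0, 0.0, 1.0]

weights = importance_weights(source_probs, target_probs)
source_risk = sum(losses) / len(losses)
target_risk = reweighted_risk(samples, losses, weights)

print(f"source risk: {source_risk:.2f}")
print(f"estimated target risk: {target_risk:.2f}")
```

In this toy case the same model and the same samples give a different risk once the samples are reweighted, which is the sense in which a distribution change can be treated as a change in sample weights. A change in the input-output relationship cannot be corrected this way, because no reweighting of inputs changes the labels.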
