We introduce a unified theoretical framework for the rigorous analysis and systematic construction of deep neural networks (DNNs). The framework addresses a gap in existing theory by explicitly modeling the structure of tensor operations, lower-level information that is often abstracted away. It supports two novel objectives: (1) analyzing how architectural complexity has evolved over the history of deep learning, and (2) automatically constructing novel architectures based on new types of tensor operations. Our study of DNNs introduced over the past 40 years reveals a connection between groundbreaking architectures and increases in different types of architectural complexity. Moreover, we identify several large classes of higher-complexity architectures that have not yet been explored. We then collect a dataset of 3,000+ higher-complexity architectures, which we publicly release at: https://github.com/combinatoriallabs/ArchitecturalComplexity.