To search for an optimal sub-network within a general deep neural network (DNN), existing neural architecture search (NAS) methods typically rely on handcrafting a search space beforehand. This requirement makes it difficult to extend them to general scenarios without significant human expertise and manual intervention. To overcome these limitations, we propose Automated Search-Space Generation Neural Architecture Search (ASGNAS), perhaps the first automated system to train general DNNs that cover all candidate connections and operations and to produce high-performing sub-networks in a one-shot manner. Technically, ASGNAS makes three contributions that minimize human effort: (i) automated search space generation for general DNNs; (ii) a Hierarchical Half-Space Projected Gradient (H2SPG) method that leverages the hierarchy and dependency within the generated search space to ensure network validity during optimization, and reliably produces a solution with both high performance and hierarchical group sparsity; and (iii) automated sub-network construction from the H2SPG solution. Numerically, we demonstrate the effectiveness of ASGNAS on a variety of general DNNs, including RegNet, StackedUnets, SuperResNet, and DARTS, over benchmark datasets such as CIFAR10, Fashion-MNIST, ImageNet, STL-10, and SVHN. The sub-networks computed by ASGNAS achieve competitive or even superior performance compared to the starting full DNNs and other state-of-the-art methods. The library will be released at https://github.com/tianyic/only_train_once.
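To make the idea of a half-space projection that yields group sparsity more concrete, the following is a minimal sketch and not the H2SPG implementation from the library. It assumes a plain gradient step followed by a group-wise test: a group is set to zero when its trial iterate falls outside the half-space defined by the current group values. The function name `half_space_project`, its parameters (`lr`, `eps`), and the toy data are hypothetical, and the hierarchy check that H2SPG uses to keep the network valid is omitted here.

```python
import numpy as np

def half_space_project(x, grad, groups, lr=0.1, eps=0.0):
    """One gradient step followed by a group-wise half-space projection.

    x      : 1-D parameter vector
    grad   : gradient of the loss at x (same shape as x)
    groups : list of index arrays, one per removable group (hypothetical layout)

    A group g is zeroed when its trial iterate leaves the half-space
    {z : z . x_g >= eps * ||x_g||^2}. The hierarchy/validity check
    described in the abstract is NOT modeled in this sketch.
    """
    x_trial = x - lr * grad          # plain gradient step
    x_new = x_trial.copy()
    for idx in groups:
        xg, tg = x[idx], x_trial[idx]
        if tg @ xg < eps * (xg @ xg):
            x_new[idx] = 0.0         # project the whole group onto zero
    return x_new

# Toy example: two groups of two parameters each.
x = np.array([1.0, 1.0, 0.5, -0.5])
grad = np.array([0.1, -0.2, 10.0, -10.0])
groups = [np.array([0, 1]), np.array([2, 3])]
x_next = half_space_project(x, grad, groups)
```

In this toy run the first group keeps its updated values, while the second group's trial iterate points away from its current values and is therefore zeroed as a whole. Removing entire groups in this way is what makes it possible to cut the corresponding structures out of the network afterwards.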