Neural Architecture Transfer 2: A Paradigm for Improving Efficiency in Multi-Objective Neural Architecture Search

Deep learning is increasingly impacting various aspects of contemporary society, and artificial neural networks have emerged as the dominant models for solving an expanding range of tasks. Neural Architecture Search (NAS) techniques, which enable the automatic design of task-optimal networks, have led to remarkable advances. However, the NAS process is typically associated with long execution times and significant computational resource requirements. Once-For-All (OFA) and its successor, Once-For-All-2 (OFAv2), were developed to mitigate these challenges. They aim to build a single super-network from which sub-networks satisfying different constraints can be extracted directly, maintaining exceptional performance without the need for retraining. Neural Architecture Transfer (NAT) was developed to maximise the effectiveness of extracting sub-networks from a super-network. In this paper, we present NATv2, an extension of NAT that improves multi-objective search algorithms applied to dynamic super-network architectures. NATv2 achieves qualitative improvements in the extractable sub-networks by exploiting the improved super-networks generated by OFAv2 and by incorporating new policies for initialising, pre-processing and updating its archive of networks. In addition, a post-processing pipeline based on fine-tuning is introduced. Experimental results show that NATv2 improves on NAT and is highly recommended for investigating high-performance architectures with a minimal number of parameters.