Neural Architecture Search (NAS) has emerged as a powerful technique for automating neural architecture design. However, existing NAS methods either spend excessive time on repeated training or sample many task-irrelevant architectures. Moreover, they generalize poorly across tasks: they usually search for the optimal architecture for each task from scratch, without reusing knowledge from previous NAS tasks. To address these limitations, we propose DiffusionNAG, a novel transferable, task-guided Neural Architecture Generation (NAG) framework based on diffusion models. Guided by a surrogate model, such as a performance predictor for a given task, DiffusionNAG can generate task-optimal architectures for diverse tasks, including unseen ones. DiffusionNAG is highly efficient because it generates task-optimal architectures by leveraging prior knowledge obtained from previous tasks and from the distribution of neural architectures. Furthermore, we introduce a score network that ensures the generated architectures are valid directed acyclic graphs, unlike existing graph generative models, which focus on generating undirected graphs. Extensive experiments demonstrate that DiffusionNAG significantly outperforms the state-of-the-art transferable NAG model in architecture generation quality, and outperforms previous NAS methods on four computer vision datasets at a substantially reduced computational cost.
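To make the validity requirement concrete: a neural architecture can be encoded as a directed graph whose nodes are operations and whose edges carry data between them, and such a graph is a valid architecture only if it contains no cycle. The sketch below illustrates that check with Kahn's topological-sort algorithm. It is only an illustration under assumed conventions, not DiffusionNAG's score network: the function name `is_valid_dag`, the edge-list encoding, and the 4-node example cell are all hypothetical.

```python
from collections import deque


def is_valid_dag(num_nodes, edges):
    """Return True if the directed graph has no cycle (Kahn's algorithm).

    num_nodes: number of operation nodes, labeled 0 .. num_nodes - 1.
    edges: iterable of (src, dst) pairs meaning "src feeds into dst".
    Both the name and the encoding are assumptions made for this sketch.
    """
    successors = [[] for _ in range(num_nodes)]
    in_degree = [0] * num_nodes
    for src, dst in edges:
        successors[src].append(dst)
        in_degree[dst] += 1

    # Start from nodes with no incoming edges (e.g. the input node).
    queue = deque(n for n in range(num_nodes) if in_degree[n] == 0)
    visited = 0
    while queue:
        node = queue.popleft()
        visited += 1
        for nxt in successors[node]:
            in_degree[nxt] -= 1
            if in_degree[nxt] == 0:
                queue.append(nxt)

    # Nodes on a cycle never reach in-degree zero, so they are never visited.
    return visited == num_nodes


# Hypothetical 4-node cell: input -> conv -> pool -> output, plus a skip edge.
cell_edges = [(0, 1), (1, 2), (2, 3), (0, 3)]
print(is_valid_dag(4, cell_edges))             # True: a valid architecture
print(is_valid_dag(4, cell_edges + [(3, 1)]))  # False: output feeds back into conv
```

A generator designed for undirected graphs has no notion of edge direction, so it cannot guarantee that a check like this passes. That is the gap the score network described above is meant to close.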