Existing NAS methods suffer from an excessive amount of time spent on the repetitive sampling and training of many task-irrelevant architectures. To tackle these limitations, we propose a paradigm shift from NAS to a novel conditional Neural Architecture Generation (NAG) framework based on diffusion models, dubbed DiffusionNAG. Specifically, we consider neural architectures as directed graphs and propose a graph diffusion model for generating them. Moreover, with the guidance of parameterized predictors, DiffusionNAG can flexibly generate task-optimal architectures with the desired properties for diverse tasks by sampling from a region that is more likely to satisfy those properties. This conditional NAG scheme is significantly more efficient than previous NAS schemes, which sample architectures and then filter them using property predictors. We validate the effectiveness of DiffusionNAG through extensive experiments in two predictor-based NAS scenarios: Transferable NAS and Bayesian Optimization (BO)-based NAS. DiffusionNAG achieves superior performance with speedups of up to 35 times over the baselines on Transferable NAS benchmarks. Furthermore, when integrated into a BO-based algorithm, DiffusionNAG outperforms existing BO-based NAS approaches, particularly in the large MobileNetV3 search space on the ImageNet 1K dataset. Code is available at https://github.com/CownowAn/DiffusionNAG.
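The core idea of predictor-guided generation can be sketched as follows: during the reverse diffusion process, the gradient of a property predictor's log-likelihood is added to the unconditional score, steering samples toward architectures likely to satisfy the desired property. The sketch below is a minimal toy illustration of this guided reverse step, not the paper's implementation: `score_model` and `predictor_grad` are hypothetical analytic stand-ins for the trained graph diffusion score network and the parameterized predictor, and the architecture is represented as a relaxed node-by-operation matrix.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "architecture" state: a relaxed operation matrix (nodes x op-types).
N_NODES, N_OPS = 8, 5

def score_model(x, t):
    """Toy unconditional score: pulls x toward zero.
    Stand-in for the trained graph diffusion model's score network."""
    return -x / (1.0 + t)

def predictor_grad(x):
    """Toy gradient of log p(property | x): pushes x toward a 'good' region.
    Stand-in for the guidance from the parameterized property predictor."""
    target = np.full_like(x, 0.5)
    return target - x

def guided_reverse_step(x, t, dt, guidance_scale=1.0):
    """One Euler-Maruyama step of predictor-guided reverse diffusion:
    the predictor gradient is added to the unconditional score, so
    samples drift toward the region satisfying the desired property."""
    score = score_model(x, t) + guidance_scale * predictor_grad(x)
    noise = rng.standard_normal(x.shape)
    return x + score * dt + np.sqrt(dt) * noise

# Sample: start from noise and integrate the guided reverse process.
x = rng.standard_normal((N_NODES, N_OPS))
t, dt = 1.0, 0.05
while t > 0:
    x = guided_reverse_step(x, t, dt)
    t -= dt

# Discretize: each node takes its highest-scoring operation.
ops = x.argmax(axis=1)
print(ops)
```

This contrasts with sample-then-filter NAS: rather than generating many architectures and discarding those the predictor scores poorly, the predictor shapes the sampling distribution itself, so most generated architectures are already near the desired region.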