Efficient Multi-objective Neural Architecture Search Framework via Policy Gradient Algorithm

Differentiable architecture search has gradually become the mainstream research topic in Neural Architecture Search (NAS) because it is far more efficient than the early EA-based and RL-based NAS methods. Recent differentiable NAS work also aims to further improve search performance and reduce GPU-memory consumption. However, these methods cannot naturally handle non-differentiable objectives, e.g., energy, resource-constrained efficiency, and other metrics, let alone multi-objective search demands. Research in multi-objective NAS targets this problem but requires vast computational resources, because each candidate architecture is optimized individually. In light of this discrepancy, we propose TND-NAS, which combines the high efficiency of the differentiable NAS framework with the compatibility with non-differentiable metrics found in multi-objective NAS. Under the differentiable NAS framework, with a continuous relaxation of the search space, TND-NAS optimizes the architecture parameters in discrete space while progressively shrinking the search space according to those architecture parameters. Our representative experiment takes two objectives (Parameters, Accuracy) as an example: we obtain a series of high-performance compact architectures on the CIFAR10 (1.09M/3.3%, 2.4M/2.95%, 9.57M/2.54%) and CIFAR100 (2.46M/18.3%, 5.46M/16.73%, 12.88M/15.20%) datasets. Compared with other multi-objective NAS methods, TND-NAS is less time-consuming (1.3 GPU-days on an NVIDIA 1080Ti, 1/6 of that of NSGA-Net) and can be conveniently adapted to real-world NAS scenarios (resource-constrained, platform-specialized).
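To make the idea of optimizing architecture parameters against a non-differentiable objective more concrete, the following is a minimal sketch, not the TND-NAS procedure itself. It assumes a REINFORCE-style policy gradient with a moving-average baseline, a toy table of candidate operations (`OPS`), and a scalarized reward that trades an accuracy proxy against parameter count via a hypothetical `size_weight`. All names and numbers are illustrative assumptions, and the progressive search space shrinking is omitted.

```python
import math
import random

# Hypothetical candidate operations. The parameter counts (in millions) and
# accuracy contributions are toy values chosen for illustration only.
OPS = {
    "skip":    {"params": 0.0, "acc": 0.0},
    "conv3x3": {"params": 0.5, "acc": 0.6},
    "conv5x5": {"params": 2.0, "acc": 1.0},
}
OP_NAMES = list(OPS)
NUM_EDGES = 4  # assumed number of edges in the toy cell


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_architecture(alpha, rng):
    """Sample one discrete operation index per edge from softmax(alpha[edge])."""
    indices = range(len(OP_NAMES))
    return [rng.choices(indices, weights=softmax(row))[0] for row in alpha]


def reward(arch, size_weight):
    """Non-differentiable toy reward: accuracy proxy minus a size penalty."""
    acc = sum(OPS[OP_NAMES[i]]["acc"] for i in arch)
    params = sum(OPS[OP_NAMES[i]]["params"] for i in arch)
    return acc - size_weight * params


def search(size_weight, steps=2000, lr=0.1, seed=0):
    """Update architecture logits with REINFORCE and return the argmax ops."""
    rng = random.Random(seed)
    alpha = [[0.0] * len(OP_NAMES) for _ in range(NUM_EDGES)]
    baseline = 0.0  # moving-average baseline to reduce gradient variance
    for _ in range(steps):
        arch = sample_architecture(alpha, rng)
        r = reward(arch, size_weight)
        advantage = r - baseline
        baseline = 0.9 * baseline + 0.1 * r
        for edge, chosen in enumerate(arch):
            probs = softmax(alpha[edge])
            for k in range(len(OP_NAMES)):
                # d log p(chosen) / d alpha[edge][k] for a softmax policy
                grad_log_prob = (1.0 if k == chosen else 0.0) - probs[k]
                alpha[edge][k] += lr * advantage * grad_log_prob
    return [OP_NAMES[max(range(len(OP_NAMES)), key=row.__getitem__)]
            for row in alpha]


if __name__ == "__main__":
    for w in (0.0, 0.5):
        print(f"size_weight={w}: {search(w)}")
```

Because the reward is only evaluated on sampled discrete architectures, it never needs to be differentiable. In this sketch, sweeping `size_weight` is one simple way to obtain architectures at different points of the size/accuracy trade-off.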
