TND-NAS: Towards Non-differentiable Objectives in Progressive Differentiable NAS Framework

Differentiable architecture search has gradually become the mainstream research topic in Neural Architecture Search (NAS) because of its high efficiency compared with early NAS methods. Recent differentiable NAS work also aims to further improve search performance and reduce GPU-memory consumption. However, these methods are no longer naturally capable of handling non-differentiable objectives, e.g., energy, resource-constrained efficiency, and other metrics, let alone multi-objective search demands. Research in multi-objective NAS targets these objectives but requires vast computational resources, because each candidate architecture is optimized separately. In light of this discrepancy, we propose TND-NAS, which combines the high efficiency of the differentiable NAS framework with the compatibility with non-differentiable metrics found in multi-objective NAS. Under the differentiable NAS framework, with continuous relaxation of the search space, TND-NAS optimizes the architecture parameters in discrete space while progressively shrinking the search space according to those parameters. Our representative experiment takes two objectives (Parameters, Accuracy) as an example: we obtain a series of high-performance compact architectures on the CIFAR10 (1.09M/3.3%, 2.4M/2.95%, 9.57M/2.54%) and CIFAR100 (2.46M/18.3%, 5.46M/16.73%, 12.88M/15.20%) datasets. Favorably, compared with other multi-objective NAS methods, TND-NAS is less time-consuming (1.3 GPU-days on NVIDIA 1080Ti, 1/6 of that of NSGA-Net) and can be conveniently adapted to real-world NAS scenarios (resource-constrained, platform-specialized).
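
To make the three ingredients named in the abstract concrete (continuous relaxation, architecture parameters updated from a non-differentiable signal, and progressive search space shrinking), the following is a minimal NumPy sketch for a single edge with a handful of candidate operations. It is an illustration under stated assumptions, not the authors' implementation: the function names, the score-function (REINFORCE-style) update, the learning rate, and the reward function are all hypothetical choices made here, since the abstract does not specify them.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    z = np.exp(x - np.max(x))
    return z / z.sum()


def mixed_output(alpha, op_outputs):
    """Continuous relaxation: softmax(alpha)-weighted sum of candidate op outputs."""
    weights = softmax(alpha)
    return sum(w * out for w, out in zip(weights, op_outputs))


def score_function_step(alpha, reward_fn, rng, lr=0.1, baseline=0.0):
    """One update of alpha from a black-box (possibly non-differentiable) reward.

    ASSUMPTION: a REINFORCE-style estimator is used here for illustration only.
    A single discrete op is sampled, scored by reward_fn, and alpha is moved
    along the gradient of the log-probability of the sampled op.
    """
    probs = softmax(alpha)
    k = rng.choice(len(alpha), p=probs)   # sample one op in discrete space
    reward = reward_fn(k)                 # e.g. accuracy minus a parameter penalty
    grad_logp = -probs                    # d log p_k / d alpha = onehot(k) - probs
    grad_logp[k] += 1.0
    return alpha + lr * (reward - baseline) * grad_logp


def shrink(alpha, ops, n_drop=1):
    """Progressive shrinking: drop the n_drop candidates with the smallest alpha."""
    keep = np.sort(np.argsort(alpha)[n_drop:])
    return alpha[keep], [ops[i] for i in keep]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ops = ["conv3x3", "skip", "pool"]     # hypothetical candidate operations
    alpha = np.zeros(len(ops))

    # Hypothetical reward: pretend op 0 is the best accuracy/size trade-off.
    def reward_fn(k):
        return 1.0 if k == 0 else 0.0

    for _ in range(200):
        alpha = score_function_step(alpha, reward_fn, rng)
    alpha, ops = shrink(alpha, ops)
    print(ops, alpha)
```

Because the reward is only ever evaluated on a sampled discrete choice, it can be any measurable quantity (parameter count, latency, energy), which is the property the abstract highlights; the shrinking step then removes low-scoring candidates so later search stages operate on a smaller space.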
