Neural architecture search (NAS) is among the most effective approaches for many tasks because it generates an application-adaptive neural architecture, but it remains limited by high computational cost and memory consumption. Meanwhile, 1-bit convolutional neural networks (CNNs), with binary weights and activations, show promise for resource-limited embedded devices. A natural approach is to combine the two in a unified framework, using 1-bit CNNs to reduce the computation and memory cost of NAS; however, searching 1-bit CNNs is more challenging because of the more complicated processes involved. In this paper, we introduce Discrepant Child-Parent Neural Architecture Search (DCP-NAS) to efficiently search 1-bit CNNs, based on a new framework in which the 1-bit model (Child) is searched under the supervision of a real-valued model (Parent). Specifically, we first use the Parent model to calculate a tangent direction, based on which a tangent propagation method is introduced to search the optimized 1-bit Child. We further observe that the weights and architecture parameters are coupled in such differentiable frameworks. To address this issue, we propose a decoupled optimization method to search for an optimized architecture. Extensive experiments demonstrate that DCP-NAS achieves much better results than prior art on both the CIFAR-10 and ImageNet datasets. In particular, the backbones found by DCP-NAS generalize well to person re-identification and object detection.
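To illustrate why binary weights and activations suit resource-limited devices, the following is a minimal NumPy sketch, not the DCP-NAS method itself. It assumes sign-based binarization (with zero mapped to +1 by convention) and shows that a dot product between two {-1, +1} vectors can be computed with an XNOR followed by a bit count. The helper names `binarize` and `xnor_popcount_dot` are hypothetical and chosen only for this example.

```python
import numpy as np


def binarize(x):
    """Map real values to {-1, +1}; zero maps to +1 (a convention assumed here)."""
    return np.where(x >= 0, 1, -1).astype(np.int8)


def xnor_popcount_dot(a_bits, b_bits):
    """Dot product of two {-1, +1} vectors encoded as boolean arrays (True = +1).

    Each matching position contributes +1 and each mismatch contributes -1,
    so the result is 2 * matches - n.
    """
    n = a_bits.size
    matches = int(np.count_nonzero(~(a_bits ^ b_bits)))  # XNOR, then popcount
    return 2 * matches - n


rng = np.random.default_rng(0)
w = rng.standard_normal(64)  # stand-in for real-valued weights
x = rng.standard_normal(64)  # stand-in for real-valued activations

wb, xb = binarize(w), binarize(x)

# Reference result using ordinary integer arithmetic on the binarized vectors.
reference = int(np.dot(wb.astype(np.int32), xb.astype(np.int32)))

# Same result using only bitwise operations on the boolean encoding.
fast = xnor_popcount_dot(wb > 0, xb > 0)
```

Here `reference` and `fast` agree, which is the property that lets 1-bit layers replace multiply-accumulate operations with bitwise ones. Training such layers and searching their architecture, the subject of this paper, involve additional machinery that this sketch does not cover.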