Edge computing aims to let edge devices, such as IoT devices, process data locally instead of relying on the cloud. However, deep learning models for tasks such as computer vision and natural language processing can be computationally expensive and memory-intensive. Manually designing an architecture specialized for each device is infeasible because memory and computational constraints vary from device to device. To address these concerns, we use Neural Architecture Search (NAS) to automate the construction of task-specific deep learning architectures optimized for device constraints. We present DCA-NAS, a principled method for fast neural architecture search that incorporates edge-device constraints such as model size and floating-point operations. It uses weight sharing and channel bottleneck techniques to reduce search time. In our experiments, DCA-NAS outperforms manual architectures of similar size and is comparable to popular mobile architectures on image classification datasets including CIFAR-10, CIFAR-100, and ImageNet-1k. Experiments with two search spaces, DARTS and NAS-Bench-201, show the generalization capabilities of DCA-NAS. When further evaluated on Hardware-NAS-Bench, our approach discovered device-specific architectures with low inference latency and state-of-the-art performance.
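To make the idea of folding a device constraint into a differentiable search more concrete, here is a minimal sketch. It is not the DCA-NAS formulation, which the abstract does not specify. It assumes a DARTS-style relaxation in which each edge mixes candidate operations with softmax weights, and it adds a hinge penalty when the expected parameter count exceeds a budget. The operation names, parameter counts, `budget`, and `lam` are all hypothetical values chosen for illustration; a FLOPs constraint could be sketched the same way with per-operation FLOP counts.

```python
import math

# Hypothetical parameter counts for the candidate operations on one edge
# (illustrative numbers only, not taken from the paper).
OP_PARAMS = {"skip_connect": 0, "sep_conv_3x3": 1200, "sep_conv_5x5": 2400}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_params(alphas):
    """Expected parameter count of one mixed edge under softmax(alphas)."""
    weights = softmax(alphas)
    return sum(w * p for w, p in zip(weights, OP_PARAMS.values()))


def constrained_objective(task_loss, edge_alphas, budget, lam=1e-3):
    """Task loss plus a hinge penalty on expected model size above the budget.

    This penalty form is an assumption made for this sketch.
    """
    size = sum(expected_params(a) for a in edge_alphas)
    return task_loss + lam * max(0.0, size - budget)


if __name__ == "__main__":
    # Two edges with uniform architecture weights.
    edges = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    print(constrained_objective(1.0, edges, budget=2000))
```

Because the expected size is a smooth function of the architecture parameters, a penalty of this kind can in principle be minimized jointly with the task loss by gradient descent, steering the search toward operations that fit the budget.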