Bayesian Optimization Mixed-Precision Neural Architecture Search (BOMP-NAS) is an approach to quantization-aware neural architecture search (QA-NAS) that combines Bayesian optimization (BO) with mixed-precision quantization (MP) to efficiently search for compact, high-performance deep neural networks. The results show that integrating quantization-aware fine-tuning (QAFT) into the NAS loop is a necessary step for finding networks that perform well under low-precision quantization: doing so reduces model size by nearly 50\% on the CIFAR-10 dataset. BOMP-NAS finds neural networks that achieve state-of-the-art performance at a much lower design cost, with a search time 6x shorter than that of the closest related work.
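To make concrete how mixed-precision quantization affects model size, the following is a minimal sketch that computes the weight storage of a network from per-layer parameter counts and per-layer bitwidths. The function name `model_size_bytes`, the layer sizes, and the bitwidth assignments are all hypothetical and chosen for illustration only; they are not taken from BOMP-NAS or its reported results.

```python
from typing import Sequence


def model_size_bytes(param_counts: Sequence[int], bitwidths: Sequence[int]) -> float:
    """Weight storage in bytes when layer i stores param_counts[i] weights
    at bitwidths[i] bits each (activations and metadata are ignored)."""
    if len(param_counts) != len(bitwidths):
        raise ValueError("param_counts and bitwidths must have the same length")
    total_bits = sum(n * b for n, b in zip(param_counts, bitwidths))
    return total_bits / 8


# Hypothetical per-layer weight counts for a small network (illustrative only).
layer_params = [432, 36_864, 73_728, 5_120]

# Uniform 8-bit quantization versus a mixed-precision assignment that
# keeps the first and last layers at 8 bits and uses 4 bits in between.
uniform_8bit = model_size_bytes(layer_params, [8, 8, 8, 8])
mixed = model_size_bytes(layer_params, [8, 4, 4, 8])

# Fraction of storage saved by the mixed-precision assignment.
reduction = 1 - mixed / uniform_8bit
print(f"uniform 8-bit: {uniform_8bit:.0f} B, mixed: {mixed:.0f} B, saved: {reduction:.1%}")
```

In a search setting, a size term of this kind could serve as one objective alongside accuracy, with the per-layer bitwidths treated as part of the search space.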