Deploying Deep Neural Networks (DNNs) on tiny devices is an increasingly common way to process the growing volume of sensor data. Multi-objective optimization approaches can compress DNNs through network pruning and weight quantization, minimizing the memory footprint (RAM), the number of parameters (ROM), and the number of floating-point operations (FLOPs) while maintaining predictive accuracy. In this paper, we show that existing multi-objective Bayesian optimization (MOBOpt) approaches can fall short in finding optimal candidates on the Pareto front, and we propose a novel solver based on an ensemble of competing parametric policies trained using an Augmented Random Search Reinforcement Learning (RL) agent. Our methodology aims to find feasible trade-offs between a DNN's predictive accuracy, its memory consumption on a given target system, and its computational complexity. Our experiments show that our approach consistently outperforms existing MOBOpt approaches across different data sets and architectures such as ResNet-18 and MobileNetV3.
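To make the notion of a Pareto front over accuracy, RAM, ROM, and FLOPs concrete, the following is a minimal Python sketch of non-dominated filtering. It is not the solver proposed in the paper: the candidate tuples, their values, and the helper names `dominates` and `pareto_front` are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# Hypothetical candidate format: (accuracy, ram_bytes, rom_bytes, flops).
Candidate = Tuple[float, float, float, float]


def dominates(a: Candidate, b: Candidate) -> bool:
    """Return True if `a` is no worse than `b` in every objective and
    strictly better in at least one. Accuracy is maximized; RAM, ROM,
    and FLOPs are minimized."""
    # Negate accuracy so that all four objectives are minimized.
    cost_a = (-a[0], a[1], a[2], a[3])
    cost_b = (-b[0], b[1], b[2], b[3])
    no_worse = all(x <= y for x, y in zip(cost_a, cost_b))
    strictly_better = any(x < y for x, y in zip(cost_a, cost_b))
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Made-up compressed-model candidates, for illustration only.
candidates = [
    (0.92, 512e3, 1.2e6, 40e6),
    (0.90, 256e3, 0.6e6, 20e6),
    (0.89, 300e3, 0.7e6, 25e6),  # worse than the second candidate everywhere
    (0.85, 128e3, 0.3e6, 10e6),
]
front = pareto_front(candidates)
```

In this toy set, the third candidate is dominated by the second and is dropped, while the remaining three represent distinct trade-offs between accuracy and resource use. A multi-objective solver, whether Bayesian or RL-based, aims to propose candidates that push this front outward.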