Deploying machine learning (ML) on milliwatt-scale edge devices (tinyML) is gaining popularity due to recent breakthroughs in ML and the Internet of Things (IoT). Most tinyML research focuses on model compression techniques that trade accuracy (and model capacity) for compact models that fit into KB-sized tiny-edge devices. In this paper, we show how such models can be enhanced by the addition of an early-exit intermediate classifier. If the intermediate classifier exhibits sufficient confidence in its prediction, the network exits early, thereby resulting in considerable time savings. Although early-exit classifiers have been proposed in previous work, these proposals focus on large networks, making their techniques suboptimal or impractical for tinyML applications. Our technique, T-RecX, is optimized specifically for tiny-CNN-sized models. In addition, we present a method to alleviate the effect of network overthinking by leveraging the representations learned by the early exit. We evaluate T-RecX on three CNNs from the MLPerf Tiny benchmark suite for image classification, keyword spotting, and visual wake word detection tasks. Our results show that T-RecX 1) improves the accuracy of the baseline network, and 2) achieves a 31.58% average reduction in FLOPS in exchange for one percent of accuracy across all evaluated models. Furthermore, we show that our methods consistently outperform popular prior works on the tiny-CNNs we evaluate.
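The early-exit decision described in the abstract can be illustrated with a minimal sketch. This is not the T-RecX implementation: the abstract does not specify how confidence is measured, so the maximum softmax probability is used here as an assumption, and the names `early_exit_predict`, `backbone_head`, `early_classifier`, `backbone_tail`, and the `threshold` value of 0.9 are all hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def early_exit_predict(
    x: Sequence[float],
    backbone_head: Callable[[Sequence[float]], Sequence[float]],
    early_classifier: Callable[[Sequence[float]], Sequence[float]],
    backbone_tail: Callable[[Sequence[float]], Sequence[float]],
    threshold: float = 0.9,  # hypothetical confidence threshold
) -> Tuple[int, bool]:
    """Return (predicted_class, exited_early).

    Assumption: confidence is the maximum softmax probability of the
    intermediate classifier; the source only says "sufficient confidence".
    """
    # Shared early layers run for every input.
    features = backbone_head(x)

    # The intermediate classifier makes a cheap prediction.
    early_probs = softmax(early_classifier(features))
    confidence = max(early_probs)
    if confidence >= threshold:
        # Confident enough: skip the remaining layers entirely.
        return early_probs.index(confidence), True

    # Otherwise pay for the rest of the network and use the final output.
    final_probs = softmax(backbone_tail(features))
    return final_probs.index(max(final_probs)), False


# Toy stand-ins for network stages (illustrative only, not real CNN layers).
def toy_head(x: Sequence[float]) -> List[float]:
    return list(x)


def toy_early(f: Sequence[float]) -> List[float]:
    return [2.0 * v for v in f]


def toy_tail(f: Sequence[float]) -> List[float]:
    return [f[0], f[1], 3.0 * f[2]]


if __name__ == "__main__":
    print(early_exit_predict([5.0, 0.0, 0.0], toy_head, toy_early, toy_tail))
    print(early_exit_predict([0.1, 0.0, 0.2], toy_head, toy_early, toy_tail))
```

With these toy stages, the first input yields a highly confident intermediate prediction and exits early, while the second falls below the threshold and runs the remaining layers. The compute saved in a real model would depend on how many inputs exit early and on how much of the network lies after the exit point.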