Intelligent edge vision tasks face a critical challenge: achieving power and latency efficiency despite the heavy computational load they typically impose on edge platforms. This work leverages one of the first "AI in sensor" vision platforms, the Sony IMX500, to achieve ultra-fast and ultra-low-power end-to-end edge vision applications. We evaluate the IMX500 and compare it to other edge platforms, such as the Google Coral Dev Micro and Sony Spresense, using gaze estimation as a case study. We propose TinyTracker, a highly efficient, fully quantized model for 2D gaze estimation designed to maximize the performance of the edge vision systems considered in this study. TinyTracker achieves a 41x size reduction (600Kb) compared to iTracker [1] without significant loss in gaze estimation accuracy (at most 0.16 cm when fully quantized). Deploying TinyTracker on the Sony IMX500 vision sensor results in an end-to-end latency of around 19 ms. The camera takes around 17.9 ms to read, process, and transmit the pixels to the accelerator. The inference time of the network is 0.86 ms, with an additional 0.24 ms for retrieving the results from the sensor. The overall energy consumption of the end-to-end system is 4.9 mJ, including 0.06 mJ for inference. The end-to-end study shows that the IMX500 is 1.7x faster than the Coral Dev Micro (19 ms vs 34.4 ms) and 7x more power efficient (4.9 mJ vs 34.2 mJ).
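The latency and energy figures above can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: the variable names are hypothetical, and the inputs are the rounded values quoted in the abstract, so the ratios it computes may differ slightly from the reported ones (the latency ratio from these rounded figures comes out near 1.8).

```python
# Rounded figures quoted in the abstract; all names here are illustrative.
imx500_latency_ms = {
    "camera_readout": 17.9,   # read, process, and transmit pixels to the accelerator
    "inference": 0.86,        # network inference on the sensor
    "result_readback": 0.24,  # retrieving the results from the sensor
}
imx500_energy_mj = 4.9             # end-to-end energy per frame
imx500_inference_energy_mj = 0.06  # energy spent on inference alone
coral_latency_ms = 34.4            # Coral Dev Micro end-to-end latency
coral_energy_mj = 34.2             # Coral Dev Micro end-to-end energy

# End-to-end latency is the sum of the three stages.
total_latency_ms = sum(imx500_latency_ms.values())

# Ratios relative to the Coral Dev Micro, from the rounded inputs above.
latency_ratio = coral_latency_ms / total_latency_ms
energy_ratio = coral_energy_mj / imx500_energy_mj

# Fraction of the end-to-end energy budget attributable to inference.
inference_energy_share = imx500_inference_energy_mj / imx500_energy_mj

print(f"total latency: {total_latency_ms:.2f} ms")
print(f"latency ratio: {latency_ratio:.2f}x")
print(f"energy ratio:  {energy_ratio:.2f}x")
print(f"inference share of energy: {inference_energy_share:.1%}")
```

Under these figures, sensor readout dominates the latency budget, while inference accounts for only about 1% of the end-to-end energy.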