Modern applications, such as autonomous vehicles, require deploying deep learning algorithms on resource-constrained edge devices for real-time image and video processing. However, there is limited understanding of the efficiency and performance of various object detection models on these devices. In this paper, we evaluate state-of-the-art object detection models, including YOLOv8 (Nano, Small, Medium), EfficientDet Lite (Lite0, Lite1, Lite2), and SSD (SSD MobileNet V1, SSDLite MobileDet). We deployed these models on popular edge devices, including the Raspberry Pi 3, 4, and 5 (with and without TPU accelerators) and the Jetson Orin Nano, collecting key performance metrics such as energy consumption, inference time, and Mean Average Precision (mAP). Our findings show that lower-mAP models such as SSD MobileNet V1 are more energy-efficient and faster at inference, whereas higher-mAP models like YOLOv8 Medium generally consume more energy and infer more slowly, with exceptions when accelerators such as TPUs are used. Among the edge devices, the Jetson Orin Nano stands out as the fastest and most energy-efficient option for request handling, despite having the highest idle energy consumption. These results underscore the need to balance accuracy, speed, and energy efficiency when deploying deep learning models on edge devices, offering practical guidance for practitioners and researchers selecting models and devices for their applications.