Continuous AI inference on resource-constrained edge hardware introduces deployment effects that are largely invisible to conventional benchmark evaluation, including temporal instability in streaming video, thermal throttling under sustained load, and workload-dependent performance variability. We present Edge-TSR, a deployment-oriented continuous edge inference system for sustained roadside perception on the NVIDIA Jetson Orin Nano. Edge-TSR integrates detection, tracking, fine-grained classification, and a lightweight track-aware temporal stabilization mechanism that improves streaming inference consistency with negligible computational overhead. Our central finding is that benchmark-centric evaluation systematically overstates deployed edge inference performance. Across three state-of-the-art baselines, we observe consistent 20-30% relative degradation when transitioning from static-image evaluation to real-world streaming deployment. Edge-TSR addresses this gap through temporal inference stabilization, recovering up to 10.16% classification accuracy over per-frame inference baselines while maintaining sustained real-time performance under continuous operation. We evaluate the complete system under diverse real-world deployment conditions, jointly characterizing inference quality, latency, throughput, and thermal behavior during long-duration operation. A 55-minute vehicular deployment over a 26 km route demonstrates sustained operation at 16.18 FPS within safe thermal limits on a single embedded device without cloud offload. Our findings show that deployment-aware evaluation and temporal inference stabilization are necessary components of continuously operating edge AI systems intended for real-world sensing deployments. We release a sample annotated streaming video evaluation dataset and full system implementation to support reproducible deployment-centric evaluation.
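The abstract does not specify how the track-aware temporal stabilization mechanism works. As an illustration only, one plausible form of such a mechanism is a per-track exponential moving average over classifier probabilities, keyed by the tracker's ID. The class name `TrackSmoother`, the `alpha` parameter, and the example probabilities below are hypothetical and are not taken from the Edge-TSR implementation.

```python
class TrackSmoother:
    """Illustrative per-track smoothing of class probabilities.

    Assumption: the tracker supplies a stable integer ID per object and the
    classifier supplies one probability vector per frame. This is a sketch
    of one possible stabilization scheme, not the Edge-TSR method itself.
    """

    def __init__(self, num_classes, alpha=0.3):
        self.num_classes = num_classes
        self.alpha = alpha   # weight given to the newest frame
        self.state = {}      # track_id -> smoothed probability vector

    def update(self, track_id, probs):
        """Blend the new per-frame probabilities into the track's history."""
        if len(probs) != self.num_classes:
            raise ValueError("unexpected probability vector length")
        prev = self.state.get(track_id)
        if prev is None:
            # First observation of this track: nothing to smooth against.
            smoothed = list(probs)
        else:
            smoothed = [
                (1.0 - self.alpha) * p + self.alpha * q
                for p, q in zip(prev, probs)
            ]
        self.state[track_id] = smoothed
        label = max(range(self.num_classes), key=smoothed.__getitem__)
        return label, smoothed

    def drop(self, track_id):
        """Forget a track once the tracker reports it as lost."""
        self.state.pop(track_id, None)


# Hypothetical per-frame outputs for a single track; frame 3 flickers.
frames = [[0.1, 0.9], [0.2, 0.8], [0.6, 0.4], [0.1, 0.9]]

raw_labels = [max(range(2), key=p.__getitem__) for p in frames]

smoother = TrackSmoother(num_classes=2, alpha=0.3)
smoothed_labels = [smoother.update(7, p)[0] for p in frames]

print(raw_labels)       # per-frame decisions
print(smoothed_labels)  # track-level decisions after smoothing
```

In this toy sequence the per-frame label flips on the third frame while the smoothed label stays constant. The per-update cost is a handful of multiply-adds per track, which is consistent in spirit with the negligible overhead the abstract reports, although the actual mechanism may differ.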