This paper proposes a novel edge-computing-enabled real-time video analysis system for intelligent visual devices. The proposed system consists of a tracking-assisted object detection module (TAODM) and a region-of-interest module (ROIM). For each video frame, TAODM adaptively decides whether to process it locally with a tracking algorithm or to offload it to the edge server for inference with an object detection model. ROIM determines the resolution and detection model configuration of each offloaded frame so that the analysis results are returned in time. TAODM and ROIM work jointly to filter out repetitive spatial-temporal semantic information, maximizing the processing rate while maintaining high video analysis accuracy. Unlike most existing works, this paper investigates real-time video analysis systems in which the intelligent visual device connects to the edge server through a wireless network with fluctuating conditions. We decompose the real-time video analysis problem into two sub-problems: the offloading decision and configuration selection. To solve them, we introduce a double deep Q-network (DDQN) based offloading approach and a contextual multi-armed bandit (CMAB) based adaptive configuration selection approach, respectively. A DDQN-CMAB reinforcement learning (DCRL) training framework is further developed to integrate the two approaches and improve overall video analysis performance. Extensive simulations evaluate the proposed solution and demonstrate its superiority over its counterparts.
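To make the two learning components concrete, the following is a minimal sketch and not the paper's implementation. It shows the standard Double DQN target, where the online network selects the next action and the target network evaluates it, alongside a simple tabular epsilon-greedy contextual bandit. The action encoding (`TRACK_LOCALLY`, `OFFLOAD`), the discrete context labels, the (resolution, model) arm tuples, and the choice of epsilon-greedy exploration are all assumptions made for illustration; the abstract does not specify the state, reward, or bandit algorithm actually used.

```python
import random
from collections import defaultdict

# Hypothetical action encoding for the offloading decision (assumption).
TRACK_LOCALLY, OFFLOAD = 0, 1


def ddqn_target(reward, next_q_online, next_q_target, gamma=0.9, done=False):
    """Double DQN target for one transition.

    The online network's Q-values choose the next action; the target
    network's Q-values evaluate that chosen action.
    """
    if done:
        return reward
    best_next = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_next]


class EpsilonGreedyCMAB:
    """Tabular contextual bandit keeping a running mean reward per (context, arm).

    Epsilon-greedy is used here only as a simple stand-in exploration rule.
    """

    def __init__(self, arms, epsilon=0.1, seed=0):
        self.arms = list(arms)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = defaultdict(int)
        self.means = defaultdict(float)

    def select(self, context):
        # Explore with probability epsilon, otherwise pick the best-known arm.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.arms)
        return max(self.arms, key=lambda arm: self.means[(context, arm)])

    def update(self, context, arm, reward):
        # Incremental running-mean update for the observed (context, arm) pair.
        key = (context, arm)
        self.counts[key] += 1
        self.means[key] += (reward - self.means[key]) / self.counts[key]
```

In this sketch, a call such as `ddqn_target(1.0, [0.2, 0.8], [0.5, 0.3])` would use the online values to pick `OFFLOAD` and the target values to score it, while `EpsilonGreedyCMAB([(480, "small"), (1080, "large")])` would learn, per network-condition label, which hypothetical (resolution, model) pair has earned the highest average reward so far.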