In edge settings, video analytics is often run as a cloud service, mainly to offload computation but also because the results are not always consumed at the video sensors. Sending high-quality video data from edge devices can be expensive in terms of both bandwidth and power. To build a streaming video analytics pipeline that uses these resources efficiently, it is therefore imperative to reduce the size of the video stream. Traditional video compression algorithms are unaware of the semantics of the video, and can be both inefficient and harmful to analytics performance. In this paper, we introduce LtC, a collaborative framework between the video source and the analytics server that efficiently learns to reduce the video streams within an analytics pipeline. Specifically, LtC uses the full-fledged analytics algorithm at the server as a teacher to train a lightweight student neural network, which is then deployed at the video source. The student network is trained to comprehend the semantic significance of the regions within each video, and this information is used to preserve the crucial regions in high quality while the remaining regions undergo aggressive compression. LtC also incorporates a novel temporal filtering algorithm, based on feature differencing, that skips transmitting frames which contribute no new information. Overall, LtC uses 28-35% less bandwidth and has up to 45% shorter response delay than recently published state-of-the-art streaming frameworks, while achieving similar analytics performance.
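To make the idea of temporal filtering by feature differencing concrete, the following is a minimal sketch and not LtC's actual algorithm. The abstract does not specify the features, the distance metric, or the threshold, so everything here is an assumption made for illustration: `block_features` is a hypothetical stand-in for features that the student network might produce (it simply averages pixel intensity over a grid), the distance is a mean absolute difference, and `threshold` is an arbitrary value.

```python
import numpy as np


def block_features(frame, grid=8):
    """Hypothetical stand-in for student-network features:
    the mean intensity of each cell in a grid x grid partition."""
    gray = np.asarray(frame, dtype=np.float32)
    if gray.ndim == 3:
        # Collapse color channels to a single intensity channel.
        gray = gray.mean(axis=2)
    h, w = gray.shape
    # Crop so the frame divides evenly into grid x grid cells.
    gray = gray[: h - h % grid, : w - w % grid]
    cells = gray.reshape(grid, gray.shape[0] // grid, grid, gray.shape[1] // grid)
    return cells.mean(axis=(1, 3)).ravel()


def select_frames(frames, threshold=5.0, grid=8):
    """Return the indices of frames worth transmitting.

    A frame is kept when its features differ from those of the last
    transmitted frame by more than `threshold` (assumed metric: mean
    absolute difference). The first frame is always kept.
    """
    sent = []
    reference = None
    for index, frame in enumerate(frames):
        features = block_features(frame, grid)
        if reference is None or np.abs(features - reference).mean() > threshold:
            sent.append(index)
            reference = features  # compare later frames against this one
    return sent


# Usage: two identical frames, then a frame with a changed region, repeated once.
static = np.zeros((64, 64), dtype=np.uint8)
changed = static.copy()
changed[:32, :32] = 200
print(select_frames([static, static, changed, changed]))  # prints [0, 2]
```

Comparing each frame against the last transmitted frame, rather than against its immediate predecessor, is a design choice of this sketch: it prevents slow drift from accumulating unnoticed across many skipped frames.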