Overlapping cameras offer exciting opportunities to view a scene from different angles, allowing for more advanced, comprehensive and robust analysis. However, existing visual analytics systems for multi-camera streams are mostly limited to (i) per-camera processing and aggregation and (ii) workload-agnostic centralized processing architectures. In this paper, we present Argus, a distributed video analytics system with cross-camera collaboration on smart cameras. We identify multi-camera, multi-target tracking as the primary task of multi-camera video analytics and develop a novel technique that avoids redundant, processing-heavy identification tasks by leveraging object-wise spatio-temporal association in the overlapping fields of view across multiple cameras. We further develop a set of techniques to perform these operations across distributed cameras at low latency and without cloud support by (i) dynamically ordering the camera and object inspection sequence and (ii) flexibly distributing the workload across smart cameras, taking into account network transmission and heterogeneous computational capacities. Evaluation on three real-world overlapping camera datasets with two Nvidia Jetson devices shows that Argus reduces the number of object identifications and the end-to-end latency by up to 7.13x and 2.19x, respectively (4.86x and 1.60x compared to the state of the art), while achieving comparable tracking quality.
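To make the idea of avoiding redundant identifications concrete, the following is a minimal sketch of spatio-temporal association between two overlapping cameras. It is not Argus's implementation, and the abstract does not specify how the association is computed. The sketch assumes a known mapping from box coordinates in camera A to camera B (`project_a_to_b`, here a hypothetical pixel shift), a simple IoU threshold for matching, and a stub `identify` function standing in for the expensive re-identification model. All names are illustrative.

```python
from typing import Callable, Dict, List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def label_second_camera(
    identified_a: Dict[str, Box],
    boxes_b: List[Box],
    project_a_to_b: Callable[[Box], Box],
    identify: Callable[[Box], str],
    iou_threshold: float = 0.5,
) -> Tuple[List[str], int]:
    """Label detections in camera B, reusing identities from camera A.

    Returns the labels for boxes_b and the number of times the
    expensive identify() call was actually needed.
    """
    labels: List[Optional[str]] = [None] * len(boxes_b)

    # Step 1: propagate identities through the overlapping field of view.
    for obj_id, box_a in identified_a.items():
        expected = project_a_to_b(box_a)
        best_idx, best_score = None, iou_threshold
        for i, box_b in enumerate(boxes_b):
            if labels[i] is None:
                score = iou(expected, box_b)
                if score >= best_score:
                    best_idx, best_score = i, score
        if best_idx is not None:
            labels[best_idx] = obj_id

    # Step 2: run identification only on boxes that were not associated.
    reid_calls = 0
    for i, box_b in enumerate(boxes_b):
        if labels[i] is None:
            labels[i] = identify(box_b)
            reid_calls += 1
    return [str(label) for label in labels], reid_calls


# Hypothetical setup: camera B sees the same scene shifted 100 px to the right.
def shift_right(box: Box) -> Box:
    return (box[0] + 100, box[1], box[2] + 100, box[3])


identified_in_a = {"person_1": (10.0, 10.0, 50.0, 90.0)}
detections_in_b = [(112.0, 10.0, 152.0, 90.0), (300.0, 20.0, 340.0, 100.0)]

labels, reid_calls = label_second_camera(
    identified_in_a, detections_in_b, shift_right, identify=lambda box: "person_2"
)
print(labels, reid_calls)  # ['person_1', 'person_2'] 1
```

In this toy setup, two detections in camera B require only one identification call, because the first box is matched to an identity already established in camera A. A real deployment would need a calibrated or learned cross-camera mapping and would have to handle occlusion and association errors, which this sketch ignores.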