We propose a new architecture for real-time anomaly detection in video data. Inspired by human behavior, it combines spatial and temporal analyses. The approach uses two distinct models. Temporal analysis relies on a recurrent convolutional network (CNN + RNN) that pairs VGG19 with a GRU to process video sequences. Spatial analysis relies on YOLOv7 to analyze individual images. The two analyses can run either in parallel, with a final prediction that combines both results, or in series, where the spatial analysis enriches the data before the temporal analysis. In this article, we compare these two configurations to evaluate the effectiveness of our hybrid approach to video anomaly detection.
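The difference between the parallel and serial configurations can be illustrated with a minimal sketch. It is not the implementation evaluated in this article: `spatial_score` and `temporal_score` are hypothetical stand-ins for YOLOv7 and the VGG19 + GRU network, frames are reduced to flat lists of values in [0, 1], and the weighted-average fusion rule and its `weight` parameter are assumptions chosen only to make the data flow visible.

```python
from typing import List, Sequence

Frame = List[float]  # toy stand-in for an image: flat list of values in [0, 1]


def spatial_score(frame: Frame) -> float:
    """Hypothetical stand-in for the per-image model (YOLOv7 in the article)."""
    return max(frame)


def temporal_score(clip: Sequence[Frame]) -> float:
    """Hypothetical stand-in for the sequence model (VGG19 + GRU in the article).

    Scores a clip by the largest change in mean value between consecutive frames.
    """
    means = [sum(f) / len(f) for f in clip]
    jumps = [abs(b - a) for a, b in zip(means, means[1:])]
    return min(1.0, max(jumps, default=0.0))


def parallel_predict(clip: Sequence[Frame], weight: float = 0.5) -> float:
    """Parallel configuration: run both analyses independently, then fuse.

    The weighted average is an assumed fusion rule, not the article's.
    """
    s = max(spatial_score(f) for f in clip)
    t = temporal_score(clip)
    return weight * s + (1.0 - weight) * t


def serial_predict(clip: Sequence[Frame]) -> float:
    """Serial configuration: spatial output enriches each frame first,
    and the temporal model then consumes the enriched sequence."""
    enriched = [f + [spatial_score(f)] for f in clip]
    return temporal_score(enriched)


if __name__ == "__main__":
    abrupt = [[0.0, 0.0], [1.0, 1.0]]  # sudden change between two frames
    print(parallel_predict(abrupt), serial_predict(abrupt))
```

In the parallel form, each model sees only the raw input and the combination happens at the decision level. In the serial form, the temporal model sees the spatial model's output as an extra input feature, so the combination happens at the data level.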