We propose a new architecture for real-time anomaly detection in video data. Inspired by human behavior, it combines spatial and temporal analyses using two distinct models. For temporal analysis, a recurrent convolutional network (CNN + RNN) pairs VGG19 with a GRU to process video sequences. For spatial analysis, YOLOv7 analyzes individual frames. The two analyses can run either in parallel, with a final prediction that combines both results, or in series, where the spatial analysis enriches the data before the temporal analysis. In this article, we compare these two configurations to evaluate the effectiveness of our hybrid approach to video anomaly detection.
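The difference between the parallel and serial configurations can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the weighted-average fusion rule, the `spatial_weight` parameter, and the toy stand-in models are all assumptions made for illustration. In a real system the spatial callable would wrap YOLOv7 and the temporal callable would wrap the VGG19 + GRU network.

```python
from typing import Callable, List, Sequence

# Placeholder for an image; a real frame would be a pixel array.
Frame = List[float]


def parallel_predict(
    frames: Sequence[Frame],
    spatial_model: Callable[[Frame], float],
    temporal_model: Callable[[Sequence[Frame]], float],
    spatial_weight: float = 0.5,
) -> float:
    """Run both analyses independently, then fuse their anomaly scores."""
    # Spatial branch: score each frame alone, keep the most anomalous one.
    spatial_score = max(spatial_model(frame) for frame in frames)
    # Temporal branch: score the whole sequence at once.
    temporal_score = temporal_model(frames)
    # Assumed fusion rule (hypothetical): a weighted average of both scores.
    return spatial_weight * spatial_score + (1.0 - spatial_weight) * temporal_score


def serial_predict(
    frames: Sequence[Frame],
    spatial_enricher: Callable[[Frame], Frame],
    temporal_model: Callable[[Sequence[Frame]], float],
) -> float:
    """Enrich every frame with spatial information, then run the temporal model."""
    enriched = [spatial_enricher(frame) for frame in frames]
    return temporal_model(enriched)


# Toy stand-ins (hypothetical) so the sketch runs without any trained model.
def toy_spatial_score(frame: Frame) -> float:
    # Pretend the brightest value is the detector's confidence.
    return max(frame)


def toy_enrich(frame: Frame) -> Frame:
    # Append the spatial score to the frame as an extra feature.
    return frame + [toy_spatial_score(frame)]


def toy_temporal_score(frames: Sequence[Frame]) -> float:
    # Largest jump in mean intensity between consecutive frames.
    means = [sum(frame) / len(frame) for frame in frames]
    return max(abs(b - a) for a, b in zip(means, means[1:]))


clip = [[0.0, 0.2], [0.0, 0.2], [1.0, 0.8]]
parallel_score = parallel_predict(clip, toy_spatial_score, toy_temporal_score)
serial_score = serial_predict(clip, toy_enrich, toy_temporal_score)
```

In the parallel form the two models never see each other's output, so the fusion rule carries all of the interaction. In the serial form the temporal model receives spatially enriched input, so its input representation has to be designed to accept the extra features.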