Video Anomaly Detection using GAN

Given the growing concern for public safety, automatic detection and recognition of abnormal events in surveillance scenes is crucial. It remains an open research problem because of its complexity and practical value. Automatically identifying abnormal events is difficult because the notion of abnormality depends on context: an event that is normal in one setting may be considered abnormal in another. The task becomes particularly challenging in surveillance footage of large crowds, where congestion and heavy occlusion are common. This thesis aims to address this use case with machine learning techniques, so that human operators are no longer needed to watch surveillance recordings for unusual activity. We develop a novel anomaly detection model based on a generative adversarial network (GAN). The model is trained to jointly learn to reconstruct the high-dimensional image space and to infer the latent space from the video context. The generator uses a residual autoencoder architecture consisting of a two-stream deep convolutional encoder, which captures both spatial and temporal information, and a multi-stage channel attention-based decoder. We also propose a technique for fine-tuning the GAN model that uses transfer learning between datasets to reduce training time and improve generalisation. Using a variety of evaluation metrics, we compare our model with current state-of-the-art techniques on four benchmark datasets. The empirical results indicate that our network performs favourably against existing techniques on all datasets.

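
To make the channel attention mentioned in the abstract more concrete, the following is a minimal sketch of one common form of it, squeeze-and-excitation style gating, written in NumPy. It is an illustrative stand-in only: the abstract does not specify the attention block used in the decoder, and the function name `channel_attention`, the weight matrices `w_reduce` and `w_expand`, the feature-map size, and the reduction ratio are all assumptions introduced here.

```python
import numpy as np


def channel_attention(features, w_reduce, w_expand):
    """Reweight the channels of a (C, H, W) feature map.

    Squeeze-and-excitation style gating, used here only as an illustration;
    the attention block in the thesis may be designed differently.
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = features.mean(axis=(1, 2))               # shape (C,)
    # Excitation: bottleneck layer with ReLU, then a sigmoid gate in (0, 1).
    hidden = np.maximum(w_reduce @ squeezed, 0.0)       # shape (C // r,)
    gate = 1.0 / (1.0 + np.exp(-(w_expand @ hidden)))   # shape (C,)
    # Scale: multiply every channel by its gate value.
    return features * gate[:, None, None], gate


# Hypothetical sizes and random weights, chosen only for demonstration.
rng = np.random.default_rng(0)
channels, reduction = 8, 4
features = rng.standard_normal((channels, 16, 16))
w_reduce = rng.standard_normal((channels // reduction, channels)) * 0.1
w_expand = rng.standard_normal((channels, channels // reduction)) * 0.1

attended, gate = channel_attention(features, w_reduce, w_expand)
```

In a trained network the two weight matrices would be learned, so the gate would come to emphasise the channels most useful for reconstruction; here they are random and serve only to show the data flow.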