Multi-task learning-based video anomaly detection methods combine multiple proxy tasks in different branches to detect video anomalies in different situations. Most existing methods either fail to combine complementary tasks that together cover all motion patterns, or do not explicitly consider object class. To address these shortcomings, we propose a novel multi-task learning-based method that combines complementary proxy tasks to better capture motion and appearance features. In the first branch, we combine semantic segmentation and future frame prediction to learn object class and consistent motion patterns, and to detect the corresponding anomalies simultaneously. In the second branch, we add several attention mechanisms to detect motion anomalies with attention to object parts, the direction of motion, and the distance of objects from the camera. Our qualitative results show that the proposed method considers object class effectively and learns motion with attention to these factors, which results in precise motion modeling and better motion anomaly detection. Additionally, quantitative results show that our method outperforms state-of-the-art methods.
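The abstract does not state how the outputs of the two branches are combined into a final decision. As an illustration only, the sketch below assumes each branch yields one anomaly score per frame, rescales each sequence to [0, 1], and takes a weighted sum. The function names, the min-max normalization, and the `weight_a` parameter are all hypothetical and are not taken from the proposed method.

```python
from typing import List, Sequence


def min_max_normalize(scores: Sequence[float]) -> List[float]:
    """Rescale per-frame scores to [0, 1]; a constant sequence maps to zeros."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def fuse_branch_scores(
    branch_a: Sequence[float],
    branch_b: Sequence[float],
    weight_a: float = 0.5,
) -> List[float]:
    """Weighted sum of two normalized per-frame score sequences.

    Assumption: branch_a holds scores from the segmentation and
    future-frame-prediction branch, branch_b holds scores from the
    attention-based motion branch. Higher means more anomalous.
    """
    if len(branch_a) != len(branch_b):
        raise ValueError("both branches must score the same number of frames")
    if not 0.0 <= weight_a <= 1.0:
        raise ValueError("weight_a must lie in [0, 1]")
    a = min_max_normalize(branch_a)
    b = min_max_normalize(branch_b)
    return [weight_a * x + (1.0 - weight_a) * y for x, y in zip(a, b)]


# Hypothetical per-frame scores for a three-frame clip.
fused = fuse_branch_scores([0.0, 1.0, 2.0], [4.0, 4.0, 8.0])
```

With equal weights, a frame is ranked as highly anomalous only when both branches agree. Shifting `weight_a` toward 0 or 1 would favor one branch, and a maximum over the two normalized scores would be a reasonable alternative if either branch alone should be able to flag a frame.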