Surveillance videos and images are used for a broad set of applications, ranging from traffic analysis to crime detection. Extrinsic camera calibration data is important for most of these analysis applications. However, security cameras are exposed to changing environmental conditions and small camera movements, which creates a need for an automated re-calibration method that can account for these varying conditions. In this paper, we present an automated camera-calibration process that leverages a dictionary-based approach and does not require prior knowledge of any camera settings. The method consists of a custom implementation of a Spatial Transformer Network (STN) and a novel topological loss function. Experiments show that the proposed method improves the IoU metric by up to 12% relative to a state-of-the-art model across five synthetic datasets and the World Cup 2014 dataset.
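For readers unfamiliar with the reported metric, the following is a minimal sketch of intersection over union (IoU) computed on two binary masks. The abstract does not state how IoU is evaluated, so the mask-based form, the function name `mask_iou`, and the toy masks below are illustrative assumptions rather than the evaluation protocol used in the paper.

```python
def mask_iou(pred, target):
    """IoU of two equally sized binary masks given as lists of rows.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    if len(pred) != len(target) or any(
        len(p_row) != len(t_row) for p_row, t_row in zip(pred, target)
    ):
        raise ValueError("masks must have the same shape")

    inter = 0  # pixels set in both masks
    union = 0  # pixels set in at least one mask
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0

    # Assumption: two empty masks are treated as a perfect match.
    return 1.0 if union == 0 else inter / union


# Toy 3x4 masks: the predicted region is the target shifted one column right.
target = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
]
pred = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]

iou = mask_iou(pred, target)  # 2 shared pixels over 6 covered pixels
```

In a calibration setting, the two masks would typically be a reference region projected with the estimated and the ground-truth camera parameters, so a small misalignment lowers the score in the same way as the one-column shift above.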