This paper considers the problem of detecting and tracking objects in a sequence of images. The problem is formulated in a filtering framework, with the outputs of an object-detection algorithm serving as measurements. An extension to the filtering formulation is proposed that incorporates class information from the previous frame, making the classification more robust even when the object-detection algorithm outputs an incorrect prediction. Further, the properties of the object-detection algorithm are exploited to quantify the uncertainty of the bounding-box detection in each frame. The complete filtering method is evaluated on camera-trap images of the four large Swedish carnivores: bear, lynx, wolf, and wolverine. The experiments show that the class-tracking formulation leads to more robust classification.
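To make the idea of carrying class information across frames concrete, the following is a minimal sketch of a discrete Bayes filter over class labels. It is not the paper's formulation: the transition model (an object keeps its class with probability `stay`, a hypothetical parameter), the use of per-class detector scores as measurement likelihoods, and all function names are assumptions chosen for illustration only.

```python
# Illustrative sketch only: a discrete Bayes filter over class labels.
# The transition model and the use of detector scores as likelihoods
# are assumptions, not taken from the paper.

CLASSES = ["bear", "lynx", "wolf", "wolverine"]


def predict(belief, stay=0.95):
    """Propagate the class belief one frame ahead.

    Assumes the object keeps its class with probability `stay` and
    otherwise switches uniformly to one of the other classes.
    """
    n = len(belief)
    switch = (1.0 - stay) / (n - 1)
    total = sum(belief)
    return [stay * b + switch * (total - b) for b in belief]


def update(predicted, scores):
    """Fuse the predicted belief with the detector's per-class scores."""
    unnorm = [p * s for p, s in zip(predicted, scores)]
    z = sum(unnorm)
    if z <= 0.0:
        # Uninformative measurement: fall back to the prediction.
        return list(predicted)
    return [u / z for u in unnorm]


def track_class(score_seq, stay=0.95):
    """Run the filter over a sequence of per-frame class scores."""
    belief = [1.0 / len(CLASSES)] * len(CLASSES)  # uniform prior
    labels = []
    for scores in score_seq:
        belief = update(predict(belief, stay), scores)
        labels.append(CLASSES[belief.index(max(belief))])
    return labels, belief


# Made-up scores: the detector prefers "wolf" in the third frame only.
frames = [
    [0.05, 0.85, 0.05, 0.05],
    [0.05, 0.85, 0.05, 0.05],
    [0.10, 0.30, 0.55, 0.05],
    [0.05, 0.85, 0.05, 0.05],
]
labels, belief = track_class(frames)
print(labels)
```

With these made-up scores, the per-frame argmax would label the third frame as wolf, whereas the filtered label stays lynx because the belief carried over from earlier frames outweighs a single weak contradicting measurement. How strongly past frames dominate is governed by the assumed `stay` value.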