We propose MFT (Multi-Flow dense Tracker), a novel method for dense, pixel-level, long-term tracking. The approach exploits optical flows estimated not only between consecutive frames, but also between pairs of frames at logarithmically spaced intervals. It selects the most reliable sequence of flows based on estimates of geometric accuracy and occlusion probability, both provided by a pre-trained CNN. We show that MFT achieves competitive performance on the TAP-Vid benchmark, outperforming baselines by a significant margin, and that it tracks densely orders of magnitude faster than state-of-the-art point-tracking methods. The method is insensitive to medium-length occlusions, and estimating flow with respect to the reference frame makes it more robust by reducing drift.
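The selection step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the cap `max_delta`, the threshold `occl_thresh`, and the rule "pick the non-occluded candidate with the lowest uncertainty" are all assumptions made for illustration. The candidate flows, uncertainties, and occlusion probabilities are assumed to be precomputed per pixel, one set for each frame gap.

```python
import numpy as np


def log_spaced_deltas(t, max_delta=32):
    """Frame gaps 1, 2, 4, ... that fit before frame t (max_delta is an assumed cap)."""
    deltas = []
    d = 1
    while d <= min(t, max_delta):
        deltas.append(d)
        d *= 2
    return deltas


def select_most_reliable(cand_flow, cand_sigma, cand_occl, occl_thresh=0.5):
    """Per-pixel choice among K candidate flow chains (hypothetical helper).

    cand_flow:  (K, H, W, 2) flow from the reference frame to the current frame
    cand_sigma: (K, H, W)    estimated uncertainty of each candidate
    cand_occl:  (K, H, W)    estimated occlusion probability of each candidate
    """
    occluded_mask = cand_occl > occl_thresh
    # Candidates judged occluded are excluded by giving them infinite cost.
    score = np.where(occluded_mask, np.inf, cand_sigma)
    best = np.argmin(score, axis=0)  # (H, W) index of the chosen candidate

    flow = np.take_along_axis(cand_flow, best[None, ..., None], axis=0)[0]
    sigma = np.take_along_axis(cand_sigma, best[None], axis=0)[0]
    # A pixel is reported occluded only if every candidate says so.
    occluded = np.all(occluded_mask, axis=0)
    return flow, sigma, occluded, best


if __name__ == "__main__":
    print(log_spaced_deltas(10))
    # Two candidates on a 1x2 image: candidate 0 is occluded at pixel 0.
    flows = np.stack([np.zeros((1, 2, 2)), np.ones((1, 2, 2))])
    sigmas = np.array([[[1.0, 0.2]], [[2.0, 0.5]]])
    occl = np.array([[[0.9, 0.1]], [[0.1, 0.1]]])
    flow, sigma, occluded, best = select_most_reliable(flows, sigmas, occl)
    print(best)
```

In this toy input, the first pixel falls back to the second candidate because the first one is flagged as occluded, while the second pixel keeps the candidate with the lower uncertainty.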