Real-time online object tracking in videos is a core task in computer vision, with wide-ranging applications including video surveillance, motion capture, and robotics. Deployed tracking systems usually lack formal safety assurances that convey when tracking is reliable and when it may fail, at best relying on heuristic measures of model confidence to raise alerts. To obtain such assurances, we propose interpreting object tracking as a sequential hypothesis test, wherein evidence for or against tracking failures is gradually accumulated over time. Leveraging recent advances in the field, our sequential test (formalized as an e-process) quickly identifies when tracking failures set in while provably keeping false alerts at a desired rate, thus limiting potentially costly re-calibration or intervention steps. The approach is computationally lightweight, requires no extra training or fine-tuning, and is in principle model-agnostic. We propose both supervised and unsupervised variants, leveraging either ground-truth or solely internal tracking information, and demonstrate the approach's effectiveness for two established tracking models across four video benchmarks. As such, sequential testing can offer a statistically grounded and efficient mechanism to incorporate safety assurances into real-time tracking systems.
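To make the idea of accumulating evidence over time concrete, the following is a minimal sketch of a generic betting-style e-process, not necessarily the construction used in this work. It assumes a hypothetical per-frame error score in [0, 1] (for example, one minus the overlap with a reference box) and a null hypothesis that the conditional mean score stays at or below a tolerance `m0`. The names `first_alert`, `m0`, `lam`, and `alpha` are illustrative assumptions. Under that null the running product is a nonnegative supermartingale starting at 1, so by Ville's inequality it crosses `1 / alpha` with probability at most `alpha`, which is what bounds the false alert rate.

```python
from typing import Iterable, Optional


def first_alert(scores: Iterable[float], m0: float = 0.5,
                lam: float = 1.0, alpha: float = 0.05) -> Optional[int]:
    """Return the first frame index (1-based) at which an alert is raised.

    Illustrative assumptions, not taken from the source:
    - each score lies in [0, 1], larger meaning worse tracking;
    - the null hypothesis is E[score_t | past] <= m0 for every frame;
    - lam is a fixed bet size in [0, 1 / m0], which keeps every factor
      1 + lam * (score - m0) nonnegative.
    """
    if not 0.0 < m0 < 1.0:
        raise ValueError("m0 must lie strictly between 0 and 1")
    if not 0.0 <= lam <= 1.0 / m0:
        raise ValueError("lam must lie in [0, 1 / m0]")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")

    wealth = 1.0  # accumulated evidence against the null, starts at 1
    for t, score in enumerate(scores, start=1):
        if not 0.0 <= score <= 1.0:
            raise ValueError("scores must lie in [0, 1]")
        # Evidence grows when the score exceeds m0 and shrinks otherwise.
        wealth *= 1.0 + lam * (score - m0)
        # Ville's inequality: under the null, this threshold is crossed
        # at any time with probability at most alpha.
        if wealth >= 1.0 / alpha:
            return t
    return None  # no alert within the observed frames
```

A fixed bet size `lam` keeps the sketch short; adaptive choices typically react faster but are beyond what this example is meant to show. Each update costs one multiplication and one comparison per frame, in line with the claim that such a test is computationally lightweight.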