Temporal action detection aims to recognize the action category and determine the start and end times of each action instance in untrimmed videos. Mixed methods have achieved remarkable performance by simply merging anchor-based and anchor-free approaches. However, two crucial issues remain in the mixed framework: (1) brute-force merging and handcrafted anchor design limit the performance and practical applicability of mixed methods; (2) a large number of false positives in action category predictions further degrades detection performance. In this paper, we propose a novel Boundary Discretization and Reliable Classification Network (BDRC-Net) that addresses these issues by introducing boundary discretization and reliable classification modules. Specifically, the boundary discretization module (BDM) elegantly merges anchor-based and anchor-free approaches through boundary discretization, avoiding the handcrafted anchor design required by traditional mixed methods. Furthermore, the reliable classification module (RCM) predicts reliable action categories to reduce false positives in action category predictions. Extensive experiments on different benchmarks demonstrate that the proposed method achieves favorable performance compared with the state of the art. For example, BDRC-Net achieves an average mAP of 68.6% on THUMOS'14, outperforming the previous best by 1.5%. The code will be released at https://github.com/zhenyingfang/BDRC-Net.
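To make the task definition and the notion of a false positive concrete, the following is a minimal sketch of how a predicted action instance is commonly scored against a ground-truth instance using temporal IoU (tIoU). It is not taken from BDRC-Net: the function names, the `(start, end, label)` tuple layout, and the 0.5 threshold are illustrative assumptions, and a full mAP evaluation would additionally rank predictions by confidence and match each ground-truth instance at most once.

```python
# Illustrative sketch only; names and the threshold are assumptions,
# not part of the BDRC-Net code base.

def temporal_iou(pred, gt):
    """Temporal IoU between two (start, end) segments given in seconds."""
    # Overlap is the distance between the later start and the earlier end.
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred, gt, threshold=0.5):
    """Check one prediction against one ground-truth instance.

    Both arguments are (start, end, label) tuples. The prediction counts
    as correct only if the category matches AND the segments overlap
    enough; otherwise it is a false positive with respect to this instance.
    """
    p_start, p_end, p_label = pred
    g_start, g_end, g_label = gt
    same_class = p_label == g_label
    return same_class and temporal_iou((p_start, p_end), (g_start, g_end)) >= threshold
```

Under this simplified check, a prediction with well-localized boundaries but the wrong category is still a false positive, which is the kind of error the reliable classification module is described as reducing.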