Although single object trackers have achieved strong performance, their large network models are difficult to deploy on resource-constrained platforms. Moreover, existing lightweight trackers balance only two or three of four criteria: parameters, performance, FLOPs, and FPS. To balance all four, this paper proposes a lightweight fully convolutional Siamese tracker called LightFC. LightFC employs a novel efficient cross-correlation module (ECM) and a novel efficient rep-center head (ERH) to enhance the nonlinear expressiveness of the convolutional tracking pipeline. The ECM adopts an attention-like architecture: it fuses local spatial and channel features from the features fused by pixel-wise correlation, and it enhances model nonlinearity with an inversion activation block. In addition, the ECM introduces skip connections and reuses the search area features to improve its performance. The ERH introduces reparameterization and channel attention to enhance the nonlinear expressiveness of the center head. Comprehensive experiments show that LightFC achieves a good balance among performance, parameters, FLOPs, and FPS. The precision score of LightFC exceeds that of MixFormerV2-S by 3.7% and 6.5% on LaSOT and TNL2K, respectively, while using 5x fewer parameters and 4.6x fewer FLOPs. In addition, LightFC runs 2x faster than MixFormerV2-S on CPUs. Our code and raw results are available at https://github.com/LiYunfengLYF/LightFC
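To make the pixel-wise correlation mentioned above concrete, the following is a minimal sketch of the operation as it is commonly defined for Siamese trackers: every spatial position of the template feature map acts as a 1x1 kernel applied over the search feature map. It is not the ECM itself and is not taken from the LightFC code. The function name `pixelwise_correlation` and the flattened `C x N` list layout are assumptions made for illustration, and plain Python lists are used only to keep the example dependency-free.

```python
def pixelwise_correlation(template, search):
    """Pixel-wise correlation between two feature maps (illustrative sketch).

    template: list of C channels, each a list of Nz values
              (Nz = Hz * Wz flattened template positions).
    search:   list of C channels, each a list of Nx values
              (Nx = Hx * Wx flattened search positions).
    Returns an Nz x Nx response: response[i][j] is the dot product, over
    channels, of template position i and search position j.
    """
    if len(template) != len(search):
        raise ValueError("template and search must have the same channel count")

    num_channels = len(template)
    num_template_pos = len(template[0])
    num_search_pos = len(search[0])

    response = []
    for i in range(num_template_pos):
        # Template position i is used as a 1x1 kernel over the search features.
        row = [
            sum(template[c][i] * search[c][j] for c in range(num_channels))
            for j in range(num_search_pos)
        ]
        response.append(row)
    return response


# Hypothetical toy features: 2 channels, 2 template positions, 3 search positions.
template = [[1.0, 0.0],
            [0.0, 2.0]]
search = [[1.0, 2.0, 3.0],
          [4.0, 5.0, 6.0]]
response = pixelwise_correlation(template, search)
```

The response has one channel per template position, so its channel count depends on the template size rather than on the backbone width. A module such as the ECM would then process this response further, but those later stages are not sketched here.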