State-of-the-art (SOTA) visual object tracking methods have significantly enhanced the autonomy of unmanned aerial vehicles (UAVs). However, in low-light conditions, irregular real noise from the environment severely degrades the performance of these SOTA methods. Moreover, existing SOTA denoising techniques often fail to meet real-time processing requirements when deployed as plug-and-play denoisers for UAV tracking. To address this challenge, this work proposes a novel conditional generative denoiser (CGDenoiser), which breaks free from the limitations of traditional deterministic paradigms: it generates the noise conditioned on the input and then removes it. To better align the input dimensions and accelerate inference, a novel nested residual Transformer conditionalizer is developed. Furthermore, an innovative multi-kernel conditional refiner is designed to refine the denoised output in a targeted manner. Extensive experiments show that CGDenoiser improves the tracking precision of the SOTA tracker by 18.18\% on DarkTrack2021 while running 5.8 times faster than the second-best-performing denoiser. Real-world tests with complex challenges further demonstrate the effectiveness and practicality of CGDenoiser. Code, a video demo, and supplementary proof for CGDenoiser are available at: \url{https://github.com/vision4robotics/CGDenoiser}.