Vehicle Detection and Classification without Residual Calculation: Accelerating HEVC Image Decoding with Random Perturbation Injection

Video analytics, and traffic surveillance in particular, increasingly needs efficient and effective methods for processing and understanding video data. Traditional full video decoding is computationally intensive and time-consuming, which has led researchers to explore alternative approaches in the compressed domain. This study introduces a novel random perturbation-based compressed domain method for reconstructing images from High Efficiency Video Coding (HEVC) bitstreams, designed specifically for traffic surveillance applications. To the best of our knowledge, our method is the first to substitute random perturbations for residual values, producing a condensed representation of the original image that retains the information relevant to video understanding tasks, with vehicle detection and classification as the key use cases. Because it does not use residual data, the proposed method significantly reduces the data needed for image reconstruction, allowing more efficient storage and transmission of information. This matters given the vast amount of video data involved in surveillance applications. On the public BIT-Vehicle dataset, we demonstrate a significant increase in reconstruction speed over the traditional full decoding approach: our method is approximately 56% faster than the pixel domain method. We also achieve a detection accuracy of 99.9%, on par with the pixel domain method, and a classification accuracy of 96.84%, only 0.98% lower than the pixel domain method. In addition, we show a significant reduction in data size. Our research establishes the potential of compressed domain methods in traffic surveillance applications, where speed and data size are critical factors.
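The core idea can be illustrated with a minimal sketch. A standard HEVC decoder reconstructs each block by adding the decoded residual to the predicted samples; the method described above skips the residual and adds a random perturbation instead. The abstract does not specify how the perturbations are generated, so the uniform distribution, the `amplitude` parameter, and the function name `reconstruct_block` below are assumptions made purely for illustration, not the authors' implementation.

```python
import random


def reconstruct_block(prediction, amplitude=2, bit_depth=8, rng=None):
    """Rebuild one block from its prediction plus random perturbations.

    prediction: 2-D list of predicted sample values (ints).
    amplitude:  hypothetical bound; each perturbation is drawn uniformly
                from [-amplitude, amplitude]. The source does not state
                the distribution, so this is an assumption.
    bit_depth:  sample bit depth, used to clip results to the valid range.
    rng:        optional random.Random instance for reproducibility.
    """
    rng = rng or random.Random()
    max_val = (1 << bit_depth) - 1
    out = []
    for row in prediction:
        out_row = []
        for p in row:
            # A full decoder would add the decoded residual here;
            # this sketch adds a random perturbation in its place.
            perturbed = p + rng.randint(-amplitude, amplitude)
            # Clip to the valid sample range, as a decoder does.
            out_row.append(min(max(perturbed, 0), max_val))
        out.append(out_row)
    return out


# Hypothetical 4x4 prediction block (values chosen for illustration only).
prediction = [
    [120, 121, 122, 123],
    [119, 120, 121, 122],
    [118, 119, 120, 121],
    [117, 118, 119, 120],
]
block = reconstruct_block(prediction, amplitude=2, rng=random.Random(0))
```

Because no residual coefficients are parsed, inverse-quantized, or inverse-transformed in this sketch, the per-block work reduces to prediction plus one random draw per sample, which is consistent with the speed and data-size motivation stated above.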
