In this paper, we propose a pipeline for real-time video denoising with low runtime cost and high perceptual quality. The vast majority of denoising studies focus on image denoising. The few works that address video denoising achieve higher quality and temporal coherence, but at a higher computational cost. The approach we introduce leverages the advantages of both image and video denoising architectures. Our pipeline first denoises the keyframes, which make up one-fifth of the frames, using the HI-GAN blind image denoising architecture. Then, the remaining four-fifths of the noisy frames and the denoised keyframe data are fed into the FastDVDnet video denoising model. The final output is rendered on the user's display in real time. The combination of these low-latency neural network architectures produces real-time denoising with high perceptual quality, with applications in video conferencing and other real-time media streaming systems. A custom noise detector analyzer provides real-time feedback to adapt the weights and improve the models' output.
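The frame routing described above can be illustrated with a minimal sketch. It assumes that every fifth frame is treated as a keyframe and that the two models are exposed as plain callables; the names `denoise_stream`, `image_denoiser`, and `video_denoiser`, as well as their signatures, are hypothetical placeholders for illustration and are not the actual HI-GAN or FastDVDnet interfaces.

```python
from typing import Any, Callable, List, Sequence

# Assumption: one keyframe in every five frames, matching the one-fifth /
# four-fifths split stated in the abstract.
KEYFRAME_INTERVAL = 5

# Hypothetical interfaces (not the real HI-GAN / FastDVDnet APIs):
#   image_denoiser(noisy_keyframe) -> clean_keyframe
#   video_denoiser(clean_keyframe, noisy_frames) -> list of clean frames
ImageDenoiser = Callable[[Any], Any]
VideoDenoiser = Callable[[Any, Sequence[Any]], List[Any]]


def denoise_stream(
    frames: Sequence[Any],
    image_denoiser: ImageDenoiser,
    video_denoiser: VideoDenoiser,
    interval: int = KEYFRAME_INTERVAL,
) -> List[Any]:
    """Route each group of `interval` frames through the two-stage pipeline.

    The first frame of every group is the keyframe and goes to the image
    denoiser; the remaining frames of the group go to the video denoiser
    together with the already denoised keyframe.
    """
    output: List[Any] = []
    for start in range(0, len(frames), interval):
        group = frames[start:start + interval]
        clean_key = image_denoiser(group[0])
        output.append(clean_key)
        if len(group) > 1:
            # The video model sees the clean keyframe plus the noisy rest.
            output.extend(video_denoiser(clean_key, group[1:]))
    return output


if __name__ == "__main__":
    # Stand-in models that only tag each frame with the stage that handled it.
    tagged = denoise_stream(
        list(range(10)),
        image_denoiser=lambda frame: ("image", frame),
        video_denoiser=lambda key, rest: [("video", f) for f in rest],
    )
    print(tagged)
```

In this sketch the output preserves the input frame order, so it could be passed directly to a display loop. The feedback from the noise detector analyzer is deliberately left out, since the abstract does not specify how it adapts the weights.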