In recent years, many deep learning-based methods have been proposed for optical flow estimation and have achieved promising results. However, they rarely take into account that most videos are compressed, and thus ignore the pre-computed information available in compressed video streams. Motion vectors, one type of compression information, record the motion between video frames. They can be extracted directly from the compressed bitstream without additional computational cost and serve as a solid prior for optical flow estimation. We therefore propose MVFlow, an optical flow model that uses motion vectors to improve both the speed and the accuracy of optical flow estimation for compressed videos. Specifically, MVFlow includes a key Motion-Vector Converting Module, which transforms the motion vectors into the same domain as optical flow so that the flow estimation module can fully exploit them. In addition, we construct four optical flow datasets for compressed videos, each containing paired frames and motion vectors. Experimental results demonstrate the superiority of MVFlow, which reduces the AEPE by 1.09 compared to existing models, or saves 52% of the time needed to reach accuracy similar to that of existing models.
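To make the two quantities in the abstract concrete, the sketch below shows (a) a naive way to turn block-level motion vectors into a dense, flow-shaped field and (b) how AEPE (average endpoint error) is computed. This is not the Motion-Vector Converting Module of MVFlow, whose design is not described here. The function names `expand_block_vectors` and `aepe` are hypothetical, and the code assumes the motion vectors are already expressed in pixel units and point in the same direction as the optical flow.

```python
import math
from typing import List, Tuple

Vector = Tuple[float, float]   # (horizontal, vertical) displacement in pixels
Field = List[List[Vector]]     # rows of per-position vectors


def expand_block_vectors(block_mvs: Field, block_size: int) -> Field:
    """Copy each block's motion vector to every pixel of that block.

    Assumption: a simple nearest-block expansion, used only for
    illustration; it is not the converting module from the paper.
    """
    dense: Field = []
    for row in block_mvs:
        # Repeat each vector horizontally across the block width.
        dense_row = [mv for mv in row for _ in range(block_size)]
        # Repeat the expanded row vertically across the block height.
        for _ in range(block_size):
            dense.append(list(dense_row))
    return dense


def aepe(pred: Field, target: Field) -> float:
    """Mean Euclidean distance between predicted and reference flow vectors."""
    total = 0.0
    count = 0
    for pred_row, target_row in zip(pred, target):
        for (pu, pv), (tu, tv) in zip(pred_row, target_row):
            total += math.hypot(pu - tu, pv - tv)
            count += 1
    if count == 0:
        raise ValueError("flow fields are empty")
    return total / count


if __name__ == "__main__":
    # One row of two 2x2 blocks: the first moves by (3, 4), the second is static.
    coarse = [[(3.0, 4.0), (0.0, 0.0)]]
    dense = expand_block_vectors(coarse, block_size=2)
    # Compare against an all-zero reference flow of the same 2x4 size.
    zero_flow = [[(0.0, 0.0)] * 4 for _ in range(2)]
    print(aepe(dense, zero_flow))  # 2.5
```

In the example, four of the eight pixels have an endpoint error of 5 and the other four have an error of 0, so the AEPE is 2.5. Motion vectors expanded this way are blocky and noisy, which is one reason a learned conversion step may be preferable to using them as flow directly.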