In recent years, many deep learning-based methods have been proposed for optical flow estimation and have achieved promising results. However, they rarely take into account that most videos are compressed, and thus they ignore the pre-computed information available in compressed video streams. Motion vectors, one type of compression information, record the motion between video frames. They can be extracted directly from the compressed bitstream at no additional computational cost and serve as a solid prior for optical flow estimation. We therefore propose MVFlow, an optical flow model that uses motion vectors to improve both the speed and the accuracy of optical flow estimation for compressed videos. Specifically, MVFlow includes a key Motion-Vector Converting Module, which transforms the motion vectors into the same domain as optical flow so that the flow estimation module can fully exploit them. We also construct four optical flow datasets for compressed videos, each containing paired frames and motion vectors. Experimental results demonstrate the superiority of MVFlow, which reduces the average end-point error (AEPE) by 1.09 compared to existing models, or reduces runtime by 52% while achieving accuracy similar to existing models.
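To make the two quantities above concrete, here is a minimal sketch. It assumes AEPE follows its common definition (the mean Euclidean distance between predicted and ground-truth flow vectors) and that motion vectors arrive as one vector per block, already in pixel units. The function names `aepe` and `block_mv_to_dense` are hypothetical, and the naive block expansion only illustrates what a coarse motion-vector prior could look like; it is not the Motion-Vector Converting Module, whose design is not described here.

```python
import numpy as np


def aepe(pred, gt):
    """Average end-point error between two flow fields of shape (H, W, 2).

    Assumes the common definition: the per-pixel Euclidean distance between
    predicted and ground-truth flow vectors, averaged over all pixels.
    """
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    if pred.shape != gt.shape or pred.shape[-1] != 2:
        raise ValueError("expected two flow fields of identical shape (H, W, 2)")
    return float(np.linalg.norm(pred - gt, axis=-1).mean())


def block_mv_to_dense(mv_blocks, block_size):
    """Copy each block's motion vector to every pixel of that block.

    mv_blocks: array of shape (Hb, Wb, 2), assumed to be in pixel units.
    Returns an array of shape (Hb * block_size, Wb * block_size, 2).
    Illustrative only: real codecs use variable block sizes, sub-pixel
    units, and reference-frame conventions that this sketch ignores.
    """
    mv_blocks = np.asarray(mv_blocks, dtype=np.float64)
    dense = np.repeat(mv_blocks, block_size, axis=0)
    return np.repeat(dense, block_size, axis=1)


# Usage: a 2x2 grid of block vectors expanded to a 4x4 coarse flow field,
# then scored against a made-up ground truth of uniform motion (1, 0).
mv = np.array([[[1.0, 0.0], [1.0, 0.0]],
               [[1.0, 0.0], [2.0, 0.0]]])
coarse_flow = block_mv_to_dense(mv, block_size=2)
gt_flow = np.zeros((4, 4, 2))
gt_flow[..., 0] = 1.0
error = aepe(coarse_flow, gt_flow)
```

In this toy setup only the bottom-right block deviates from the ground truth, so the error reflects a quarter of the pixels being off by one pixel each. A learned model would be expected to refine such a coarse prior rather than use it directly.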