MPI-Flow: Learning Realistic Optical Flow with Multiplane Images

The accuracy of learning-based optical flow estimation models depends heavily on the realism of their training datasets. Current approaches for generating such datasets either employ synthetic data or generate images with limited realism, and the resulting domain gap with real-world scenes constrains how well the trained models generalize to real-world applications. To address this issue, we investigate generating realistic optical flow datasets from real-world images. First, to generate highly realistic new images, we construct a layered depth representation, known as multiplane images (MPI), from single-view images and render novel views from it. To generate optical flow maps that correspond accurately to the new image, we calculate the optical flow of each plane using the camera matrix and plane depths, then project these layered optical flows into the output optical flow map with volume rendering. Second, to ensure the realism of motion, we present an independent object motion module that separates camera motion from dynamic object motion in MPI. This module addresses a deficiency of MPI-based single-view methods, in which optical flow is generated only by camera motion and does not account for any object movement. We additionally devise a depth-aware inpainting module to merge new images with dynamic objects and address unnatural motion occlusions. We show the superior performance of our method through extensive experiments on real-world datasets. Moreover, our approach achieves state-of-the-art performance in both unsupervised and supervised training of learning-based models. The code will be made publicly available at: \url{https://github.com/Sharpiless/MPI-Flow}.

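The per-plane flow computation and volume-rendering step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes fronto-parallel planes in the source camera frame, a single intrinsic matrix `K` shared by both views, a rigid motion `(R, t)` that maps source-camera coordinates to target-camera coordinates, and planes ordered nearest first. The function names `plane_flow` and `composite_flow` are hypothetical and chosen only for this example.

```python
import numpy as np


def plane_flow(K, R, t, depth, height, width):
    """Flow induced on a fronto-parallel plane at `depth` by camera motion (R, t).

    Assumption: a point X in the source camera frame maps to R @ X + t in the
    target camera frame, and the plane satisfies n^T X = depth with n = (0, 0, 1).
    """
    n = np.array([0.0, 0.0, 1.0])
    # Plane-induced homography: H = K (R + t n^T / d) K^{-1}
    H = K @ (R + np.outer(t, n) / depth) @ np.linalg.inv(K)

    # Homogeneous pixel grid of shape (height, width, 3)
    u, v = np.meshgrid(np.arange(width, dtype=np.float64),
                       np.arange(height, dtype=np.float64))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1)

    # Warp every pixel with H, then dehomogenize
    warped = pix @ H.T
    warped = warped[..., :2] / warped[..., 2:3]
    return warped - pix[..., :2]  # (height, width, 2)


def composite_flow(flows, alphas):
    """Alpha-composite per-plane flows into one flow map.

    flows:  (N, H, W, 2) per-plane flow maps, nearest plane first (assumed order)
    alphas: (N, H, W) per-plane opacities in [0, 1]
    """
    # Transmittance of plane i is the product of (1 - alpha_j) over nearer planes j
    trans = np.cumprod(1.0 - alphas, axis=0)
    trans = np.concatenate([np.ones_like(alphas[:1]), trans[:-1]], axis=0)
    weights = alphas * trans
    return (weights[..., None] * flows).sum(axis=0)
```

Under these assumptions, a pure translation `t = (tx, 0, 0)` with focal length `f` yields a constant horizontal flow of `f * tx / depth` on each plane, so nearer planes move more than farther ones, and the compositing weights select the nearest opaque plane at each pixel.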