We introduce MaskFlow, a novel motion estimation method capable of estimating accurate motion fields even in very challenging cases with small objects, large displacements, and drastic appearance changes. In addition to the low-level features used by other Deep Neural Network (DNN)-based motion estimation methods, MaskFlow draws on object-level features and segmentations. These features and segmentations are used to approximate the objects' translation motion field. We propose a novel and effective way of incorporating this incomplete translation motion field into a subsequent motion estimation network for refinement and completion. We also produce a new, challenging synthetic dataset with motion field ground truth, together with additional ground truth for object-instance matchings and the corresponding segmentation masks. We demonstrate that MaskFlow outperforms state-of-the-art methods on our new challenging dataset while still producing comparable results on the popular FlyingThings3D benchmark dataset.
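To make the notion of an incomplete translation motion field concrete, the following is a minimal sketch and not the MaskFlow implementation. It assumes that matched instance masks for the two frames are already available, and it approximates each object's translation by the displacement of its mask centroid; that choice, along with the hypothetical helper names `centroid` and `translation_flow`, is an illustrative assumption rather than something stated in the abstract. Pixels outside every object mask are left undefined, which is what makes the field incomplete and leaves work for a subsequent refinement stage.

```python
def centroid(mask):
    """Return the (x, y) centroid of the non-zero pixels of a binary mask."""
    sum_x = sum_y = count = 0
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if value:
                sum_x += x
                sum_y += y
                count += 1
    if count == 0:
        raise ValueError("mask contains no object pixels")
    return sum_x / count, sum_y / count


def translation_flow(masks_a, masks_b):
    """Build an incomplete flow field from matched instance masks.

    masks_a[i] and masks_b[i] are binary masks (lists of rows) of the same
    object in the first and second frame. The result is an H x W grid whose
    entries are (dx, dy) for pixels covered by an object in the first frame
    and None everywhere else.

    Assumption: each object's motion is approximated by the shift of its
    mask centroid. This is an illustrative stand-in, not the paper's method.
    """
    height, width = len(masks_a[0]), len(masks_a[0][0])
    flow = [[None] * width for _ in range(height)]
    for mask_a, mask_b in zip(masks_a, masks_b):
        ax, ay = centroid(mask_a)
        bx, by = centroid(mask_b)
        shift = (bx - ax, by - ay)
        for y in range(height):
            for x in range(width):
                if mask_a[y][x]:
                    flow[y][x] = shift
    return flow
```

Calling `translation_flow([mask_a], [mask_b])` with a single matched pair assigns the same displacement vector to every pixel of that object in the first frame. A centroid shift ignores rotation, scaling, and occlusion, so a field built this way is only a coarse initial estimate.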