Traditional unsupervised optical flow methods are vulnerable to occlusions and motion boundaries because they lack object-level information. We therefore propose UnSAMFlow, an unsupervised flow network that also leverages object information from the Segment Anything Model (SAM), a recent foundation model. We first introduce a self-supervised semantic augmentation module tailored to SAM masks. We then analyze the poor gradient landscapes of traditional smoothness losses and propose a new smoothness definition based on homography instead. Finally, we add a simple yet effective mask feature module that further aggregates features at the object level. With these adaptations, our method produces clear optical flow estimates with sharp boundaries around objects and outperforms state-of-the-art methods on both the KITTI and Sintel datasets. It also generalizes well across domains and runs very efficiently.
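To make the homography-based smoothness idea concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes that the flow inside one object mask should be explained by a single homography, fits that homography by plain least squares (direct linear transform), and measures the mean L1 deviation of the flow from it. The function names `fit_homography` and `homography_smoothness`, the least-squares fit, and the L1 penalty are all illustrative assumptions; the actual method may use a different fitting procedure and loss.

```python
import numpy as np


def fit_homography(src, dst):
    """Least-squares homography (DLT) mapping src -> dst; both are (N, 2), N >= 4.

    Hypothetical helper for illustration. It assumes the fitted H[2, 2] is not
    close to zero and skips coordinate normalization for brevity.
    """
    x, y = src[:, 0], src[:, 1]
    u, v = dst[:, 0], dst[:, 1]
    zeros, ones = np.zeros_like(x), np.ones_like(x)
    # Each correspondence contributes two linear constraints on the 9 entries of H.
    rows_u = np.stack([-x, -y, -ones, zeros, zeros, zeros, u * x, u * y, u], axis=1)
    rows_v = np.stack([zeros, zeros, zeros, -x, -y, -ones, v * x, v * y, v], axis=1)
    A = np.concatenate([rows_u, rows_v], axis=0)
    # The right singular vector with the smallest singular value minimizes ||A h||.
    _, _, vt = np.linalg.svd(A)
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]


def homography_smoothness(flow, mask):
    """Mean L1 gap between the flow and a homography fitted inside `mask`.

    flow: (H, W, 2) array with (dx, dy) per pixel; mask: (H, W) boolean array.
    """
    ys, xs = np.nonzero(mask)
    src = np.stack([xs, ys], axis=1).astype(np.float64)
    dst = src + flow[ys, xs]  # where each masked pixel lands in the second frame
    H = fit_homography(src, dst)
    src_h = np.concatenate([src, np.ones((len(src), 1))], axis=1)
    proj = src_h @ H.T
    proj = proj[:, :2] / proj[:, 2:3]  # back from homogeneous coordinates
    return np.abs(proj - dst).mean()
```

In this sketch, a flow field that is exactly a homography inside the mask gives a value near zero, while a single outlier pixel raises it. In a training setting the same quantity would have to be computed with a differentiable framework, once per mask.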