Applying optical flow in practice often requires resizing the input to smaller dimensions to reduce computational cost. However, downsizing makes estimation harder because objects and motion ranges become smaller. Although recent approaches achieve high-quality flow estimation, they tend to fail to model small objects and precise boundaries accurately when the input resolution is lowered, which restricts their applicability to high-resolution inputs. In this paper, we introduce AnyFlow, a robust network that estimates accurate flow from images of various resolutions. By modeling optical flow with a continuous coordinate-based representation, AnyFlow generates outputs at arbitrary scales from low-resolution inputs, and it outperforms prior works in capturing tiny objects and preserving detail across a wide range of scenes. We establish a new state of the art in cross-dataset generalization on the KITTI dataset, while achieving accuracy comparable to other SOTA methods on the online benchmarks.
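The idea of querying a flow field at arbitrary output scales from a low-resolution estimate can be illustrated with a minimal sketch. This is not AnyFlow's network: the abstract describes a learned continuous representation, whereas the code below substitutes plain bilinear interpolation as a stand-in, only to show what sampling flow at continuous coordinates involves. The function name `sample_flow` and its parameters are hypothetical and do not come from the paper.

```python
import numpy as np


def sample_flow(flow_lr, out_h, out_w):
    """Query a low-resolution flow field at an arbitrary output resolution.

    Illustrative stand-in (bilinear interpolation), not the learned
    representation described in the abstract.

    flow_lr: array of shape (h, w, 2) holding (dx, dy) in low-res pixels.
    Returns an array of shape (out_h, out_w, 2) holding (dx, dy) in
    output-resolution pixels.
    """
    h, w, _ = flow_lr.shape

    # Map each output pixel centre to a continuous coordinate in the
    # low-resolution grid, then clamp to the valid range.
    ys = np.clip((np.arange(out_h) + 0.5) * h / out_h - 0.5, 0, h - 1)
    xs = np.clip((np.arange(out_w) + 0.5) * w / out_w - 0.5, 0, w - 1)

    # Integer neighbours and fractional weights for bilinear interpolation.
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, h - 1)
    x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[:, None, None]
    wx = (xs - x0)[None, :, None]

    top = flow_lr[y0][:, x0] * (1 - wx) + flow_lr[y0][:, x1] * wx
    bottom = flow_lr[y1][:, x0] * (1 - wx) + flow_lr[y1][:, x1] * wx
    flow = top * (1 - wy) + bottom * wy

    # Displacements are measured in pixels, so they must be rescaled
    # along with the grid: dx by the width ratio, dy by the height ratio.
    return flow * np.array([out_w / w, out_h / h])


if __name__ == "__main__":
    flow_lr = np.zeros((4, 6, 2))
    flow_lr[..., 0] = 1.0  # one low-res pixel to the right
    flow_hr = sample_flow(flow_lr, 8, 12)
    print(flow_hr.shape)
```

The rescaling step is the detail that is easy to miss: a motion of one pixel at low resolution corresponds to a motion of two pixels when the grid is sampled at twice the resolution. Fixed interpolation of this kind tends to blur small objects and boundaries, which is the limitation the abstract says a learned continuous representation is meant to address.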