We introduce Warping-Alone Field Transforms (WAFT), a simple and effective method for optical flow. WAFT is similar to RAFT but replaces the cost volume with high-resolution warping, achieving better accuracy at lower memory cost. This design challenges the conventional wisdom that constructing cost volumes is necessary for strong performance. WAFT is a simple and flexible meta-architecture with minimal inductive biases and minimal reliance on custom designs. WAFT ranks 1st on the Spring, Sintel, and KITTI benchmarks and achieves the best zero-shot generalization on KITTI, while being 1.3-4.1x faster than existing methods of competitive accuracy (e.g., 1.3x faster than Flowformer++ and 4.1x faster than CCMR+). Code and model weights are available at \href{https://github.com/princeton-vl/WAFT}{https://github.com/princeton-vl/WAFT}.
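To make the warping operation concrete, the following is a minimal NumPy sketch of generic backward warping with bilinear sampling. It is not taken from the WAFT code base: the function name `warp_backward`, the `(H, W, C)` array layout, and the border clamping are all assumptions made for illustration. The warped result has the same `H x W x C` size as the input features, whereas an all-pairs cost volume stores one correlation value for every pair of pixels.

```python
import numpy as np

def warp_backward(feat2, flow):
    """Bilinearly sample feat2 at (x + u, y + v) for every pixel (x, y).

    feat2: (H, W, C) features of the second frame (layout is an assumption).
    flow:  (H, W, 2) flow field; flow[..., 0] is horizontal, flow[..., 1] vertical.
    Sample positions are clamped to the image border (an assumption of this sketch).
    """
    H, W, _ = feat2.shape
    ys, xs = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")

    # Target sampling coordinates in the second frame, clamped to valid range.
    x = np.clip(xs + flow[..., 0], 0, W - 1)
    y = np.clip(ys + flow[..., 1], 0, H - 1)

    # Integer corners surrounding each sampling position.
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1 = np.minimum(x0 + 1, W - 1)
    y1 = np.minimum(y0 + 1, H - 1)

    # Fractional offsets, broadcast over the channel axis.
    wx = (x - x0)[..., None]
    wy = (y - y0)[..., None]

    # Interpolate along x on the two rows, then along y.
    top = (1 - wx) * feat2[y0, x0] + wx * feat2[y0, x1]
    bottom = (1 - wx) * feat2[y1, x0] + wx * feat2[y1, x1]
    return (1 - wy) * top + wy * bottom
```

In an iterative scheme of this kind, the warped features would typically be compared with the first frame's features at the same pixel, and the flow estimate refined from that comparison; how WAFT does this in detail is not specified here.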