Optical flow, the estimation of motion fields from image sequences, is one of the fundamental problems in computer vision. Unlike most pixel-wise tasks, which aim at consistent representations within the same category, optical flow additionally demands local discrimination and smoothness, a requirement that existing approaches have not fully explored. In this paper, we introduce Gaussian Attention (GA) into optical flow models to accentuate local properties during representation learning and to enforce motion affinity during matching. Specifically, we propose a novel Gaussian-Constrained Layer (GCL), which can be easily plugged into existing Transformer blocks to highlight the local neighborhood that contains fine-grained structural information. Moreover, for reliable motion analysis, we provide a new Gaussian-Guided Attention Module (GGAM), which not only inherits the properties of the Gaussian distribution to focus naturally on the neighborhood of each point, but also emphasizes contextually related regions during matching. Our fully equipped model, the Gaussian Attention Flow network (GAFlow), incorporates this series of novel Gaussian-based modules into the conventional optical flow framework for reliable motion analysis. Extensive experiments on standard optical flow datasets consistently demonstrate the exceptional performance of the proposed approach in both generalization ability evaluation and online benchmark testing. Code is available at https://github.com/LA30/GAFlow.
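To make the idea of Gaussian-weighted attention concrete, the following is a minimal sketch of dot-product attention over a pixel grid whose logits are shifted by a Gaussian locality prior. It is an illustration under stated assumptions, not the GCL or GGAM as implemented in GAFlow: the function names (`gaussian_log_prior`, `gaussian_attention`), the fixed isotropic `sigma`, and the choice to add the log-Gaussian directly to the attention logits are all hypothetical.

```python
import math


def gaussian_log_prior(height, width, sigma):
    """Log of an unnormalized isotropic Gaussian over pixel distance.

    Returns an N x N nested list (N = height * width); entry [i][j] is
    -d(i, j)^2 / (2 * sigma^2) for the squared grid distance d^2.
    """
    coords = [(y, x) for y in range(height) for x in range(width)]
    return [
        [-((qy - ky) ** 2 + (qx - kx) ** 2) / (2.0 * sigma ** 2)
         for (ky, kx) in coords]
        for (qy, qx) in coords
    ]


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def gaussian_attention(queries, keys, values, height, width, sigma=1.0):
    """Scaled dot-product attention with an additive Gaussian locality prior.

    Assumption: sigma is a fixed hyperparameter here; a real model might
    learn it or predict it per position.
    """
    dim = len(queries[0])
    prior = gaussian_log_prior(height, width, sigma)
    weights = []
    for i, q in enumerate(queries):
        logits = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) + prior[i][j]
            for j, k in enumerate(keys)
        ]
        weights.append(softmax(logits))
    channels = len(values[0])
    outputs = [
        [sum(w * v[c] for w, v in zip(row, values)) for c in range(channels)]
        for row in weights
    ]
    return outputs, weights


# Usage: a 3x3 grid with identical features, so only the prior shapes the weights.
features = [[1.0, 0.0] for _ in range(9)]
outputs, weights = gaussian_attention(features, features, features, 3, 3, sigma=1.0)
```

Because adding a log-Gaussian to the logits multiplies the softmax weights by a Gaussian in pixel distance, each position attends most strongly to its own neighborhood, while the content term can still raise the weight of similar features elsewhere. A smaller `sigma` narrows the neighborhood; a very large `sigma` recovers ordinary global attention.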