Global context is valuable in image-to-image translation problems. Conventional attention-based and graph-based models capture global context to a large extent, but they are computationally expensive. Moreover, existing approaches are limited to learning the pairwise semantic relation between any two points in the image. In this paper, we present Latent Graph Attention (LGA), a computationally inexpensive (linear in the number of nodes), stable, and modular framework for incorporating global context into existing architectures. It especially helps small-scale architectures approach the performance of large architectures, making lightweight architectures more useful for edge devices with limited compute and energy budgets. LGA propagates information spatially using a network of locally connected graphs, which makes it possible to construct a semantically coherent relation between any two spatially distant points that also accounts for the influence of the intermediate pixels. Moreover, the depth of the graph network can be used to adapt the extent of contextual spread to the target dataset, which gives explicit control over the added computational cost. To enhance the learning mechanism of LGA, we also introduce a novel contrastive loss term that helps the LGA module couple well with the original architecture at the expense of minimal additional computational load. We show that incorporating LGA improves performance on three challenging applications: transparent object segmentation, image restoration for dehazing, and optical flow estimation.
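As a rough illustration of how stacking locally connected graph layers spreads context with depth while keeping the per-layer cost linear in the number of nodes, consider the toy sketch below. It is not the LGA module from the paper. The scalar features, the similarity-based attention score, the 8-neighbour connectivity, and the names `local_attention_layer`, `propagate`, and `reach` are all assumptions made for this example.

```python
import math


def local_attention_layer(feat):
    """One round of message passing on an 8-connected grid graph.

    Each pixel attends only to itself and its immediate neighbours, so a
    layer evaluates at most 9 weights per pixel: linear in the pixel count.
    """
    h, w = len(feat), len(feat[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            neighbours = [
                (i + di, j + dj)
                for di in (-1, 0, 1)
                for dj in (-1, 0, 1)
                if 0 <= i + di < h and 0 <= j + dj < w
            ]
            # Illustrative score: negative squared feature difference
            # (an assumption, not the scoring used in the paper).
            scores = [-(feat[i][j] - feat[a][b]) ** 2 for a, b in neighbours]
            top = max(scores)  # subtract the max for a stable softmax
            weights = [math.exp(s - top) for s in scores]
            total = sum(weights)
            out[i][j] = sum(
                wt * feat[a][b] for wt, (a, b) in zip(weights, neighbours)
            ) / total
    return out


def propagate(feat, depth):
    """Apply `depth` local attention layers in sequence."""
    for _ in range(depth):
        feat = local_attention_layer(feat)
    return feat


def reach(feat):
    """Count pixels that have received a nonzero signal."""
    return sum(1 for row in feat for v in row if v > 0.0)


# A single active pixel at the centre of a 9x9 grid.
grid = [[0.0] * 9 for _ in range(9)]
grid[4][4] = 1.0

for depth in (1, 2, 3):
    print(depth, reach(propagate(grid, depth)))
```

In this toy setting the signal from the centre pixel reaches a 3x3, 5x5, and 7x7 window after one, two, and three layers (9, 25, and 49 pixels). This mirrors the abstract's point that depth sets the extent of contextual spread, and that the influence between two distant pixels passes through the pixels between them.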