SAIR: Learning Semantic-aware Implicit Representation

An implicit representation of an image maps arbitrary coordinates in the continuous domain to their corresponding color values, which makes it a powerful tool for image reconstruction. However, existing implicit representation approaches focus only on building a continuous appearance mapping and ignore the continuity of semantic information across pixels. As a result, they struggle to produce the desired reconstruction when the semantic information in the input image is corrupted, for example when a large region is missing. To address this issue, we propose to learn a semantic-aware implicit representation (SAIR): the implicit representation of each pixel relies on both its appearance and its semantic information (e.g., which object the pixel belongs to). To this end, we propose a framework with two modules. (1) We build a semantic implicit representation (SIR) for a corrupted image in which large regions are missing. Given an arbitrary coordinate in the continuous domain, we can obtain its text-aligned embedding, which indicates the object the pixel belongs to. (2) We build an appearance implicit representation (AIR) based on the SIR. Given an arbitrary coordinate in the continuous domain, we can reconstruct its color whether or not the pixel is missing in the input. We validate the semantic-aware implicit representation on the image inpainting task, and extensive experiments demonstrate that our method surpasses state-of-the-art approaches by a significant margin.
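The two-module idea can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the feature grids, layer sizes, random weights, and the names `bilinear_sample`, `query_sir`, and `query_air` are all assumptions made for illustration. It only shows the data flow in which a continuous coordinate is first mapped to a semantic embedding (SIR), and that embedding is then combined with an appearance feature to predict a color (AIR).

```python
import numpy as np

rng = np.random.default_rng(0)

def bilinear_sample(grid, xy):
    """Sample an (H, W, C) feature grid at continuous coords xy in [0, 1]^2."""
    h, w, _ = grid.shape
    x = xy[:, 0] * (w - 1)
    y = xy[:, 1] * (h - 1)
    # Index of the top-left neighbour, clipped so x0 + 1 and y0 + 1 stay valid.
    x0 = np.clip(np.floor(x).astype(int), 0, w - 2)
    y0 = np.clip(np.floor(y).astype(int), 0, h - 2)
    fx = (x - x0)[:, None]
    fy = (y - y0)[:, None]
    top = grid[y0, x0] * (1 - fx) + grid[y0, x0 + 1] * fx
    bot = grid[y0 + 1, x0] * (1 - fx) + grid[y0 + 1, x0 + 1] * fx
    return top * (1 - fy) + bot * fy

def init_mlp(sizes):
    """Randomly initialised (weight, bias) pairs; untrained, for shape checks only."""
    return [(rng.normal(0, 0.1, (a, b)), np.zeros(b))
            for a, b in zip(sizes[:-1], sizes[1:])]

def mlp(x, layers):
    """Plain MLP with ReLU between layers and a linear output."""
    for i, (w, b) in enumerate(layers):
        x = x @ w + b
        if i < len(layers) - 1:
            x = np.maximum(x, 0.0)
    return x

# Hypothetical sizes; the random grids stand in for encoder outputs
# computed from the corrupted image.
H, W, C_APP, C_SEM = 8, 8, 16, 32
app_feats = rng.normal(size=(H, W, C_APP))  # assumed appearance features
sem_feats = rng.normal(size=(H, W, C_SEM))  # assumed text-aligned semantic features

sir_mlp = init_mlp([C_SEM + 2, 64, C_SEM])
air_mlp = init_mlp([C_APP + C_SEM + 2, 64, 3])

def query_sir(xy):
    """SIR: continuous coordinate -> semantic embedding."""
    f = bilinear_sample(sem_feats, xy)
    return mlp(np.concatenate([f, xy], axis=1), sir_mlp)

def query_air(xy):
    """AIR: continuous coordinate -> RGB, conditioned on the SIR output."""
    sem = query_sir(xy)
    app = bilinear_sample(app_feats, xy)
    return mlp(np.concatenate([app, sem, xy], axis=1), air_mlp)

coords = rng.uniform(0, 1, size=(5, 2))  # arbitrary continuous query points
colors = query_air(coords)               # one RGB triple per query point
```

Because both functions take real-valued coordinates rather than pixel indices, the same query works at any resolution and at locations that are missing in the input. In a trained system the weights and feature grids would be learned rather than sampled at random.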

