Implicit representation of an image can map arbitrary coordinates in the continuous domain to their corresponding color values, presenting a powerful capability for image reconstruction. Nevertheless, existing implicit representation approaches focus only on building a continuous appearance mapping and ignore the continuity of semantic information across pixels. As a result, they can hardly achieve the desired reconstruction results when the semantic information within the input image is corrupted, for example, when a large region is missing. To address this issue, we propose to learn a semantic-aware implicit representation (SAIR); that is, we make the implicit representation of each pixel rely on both its appearance and its semantic information (e.g., which object the pixel belongs to). To this end, we propose a framework with two modules: (1) building a semantic implicit representation (SIR) for a corrupted image in which large regions are missing, so that, given an arbitrary coordinate in the continuous domain, we can obtain its text-aligned embedding indicating the object the pixel belongs to; and (2) building an appearance implicit representation (AIR) based on the SIR, so that, given an arbitrary coordinate in the continuous domain, we can reconstruct its color whether or not the pixel is missing in the input. We validate the proposed semantic-aware implicit representation on the image inpainting task, and extensive experiments demonstrate that our method surpasses state-of-the-art approaches by a significant margin.
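The two-module idea described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the architecture, so everything here is an assumption made for illustration: the feature grids are random stand-ins for encoder outputs, the grid sizes and channel counts are arbitrary, the MLPs are untrained, and the helper names (`bilinear_sample`, `sir`, `air`) are hypothetical. The sketch only shows the data flow: a continuous coordinate is mapped to a semantic embedding (SIR), and that embedding is then combined with an appearance feature to predict a color (AIR).

```python
import numpy as np

rng = np.random.default_rng(0)

def bilinear_sample(grid, xy):
    """Sample an (H, W, C) feature grid at continuous coords xy in [0, 1]^2, shape (N, 2)."""
    h, w, _ = grid.shape
    x = xy[:, 0] * (w - 1)
    y = xy[:, 1] * (h - 1)
    # Index of the top-left neighbor, clipped so that x0 + 1 and y0 + 1 stay in range.
    x0 = np.clip(np.floor(x).astype(int), 0, w - 2)
    y0 = np.clip(np.floor(y).astype(int), 0, h - 2)
    fx = (x - x0)[:, None]
    fy = (y - y0)[:, None]
    top = grid[y0, x0] * (1 - fx) + grid[y0, x0 + 1] * fx
    bottom = grid[y0 + 1, x0] * (1 - fx) + grid[y0 + 1, x0 + 1] * fx
    return top * (1 - fy) + bottom * fy

def init_mlp(sizes):
    """Randomly initialized (untrained) weights for a small fully connected network."""
    return [(rng.normal(0, 0.1, (i, o)), np.zeros(o))
            for i, o in zip(sizes[:-1], sizes[1:])]

def mlp(params, x):
    """Apply the network: ReLU after every layer except the last."""
    for k, (w, b) in enumerate(params):
        x = x @ w + b
        if k < len(params) - 1:
            x = np.maximum(x, 0.0)
    return x

# Hypothetical sizes; not taken from the paper.
H, W, C_APP, C_SEM = 8, 8, 16, 32
appearance_grid = rng.normal(size=(H, W, C_APP))  # stand-in for appearance features
semantic_grid = rng.normal(size=(H, W, C_SEM))    # stand-in for text-aligned features

sir_params = init_mlp([C_SEM + 2, 64, C_SEM])
air_params = init_mlp([C_APP + C_SEM + 2, 64, 3])

def sir(xy):
    """Semantic implicit representation: coordinate -> semantic embedding."""
    feat = bilinear_sample(semantic_grid, xy)
    return mlp(sir_params, np.concatenate([feat, xy], axis=1))

def air(xy):
    """Appearance implicit representation: coordinate -> RGB, conditioned on the SIR output."""
    sem = sir(xy)
    app = bilinear_sample(appearance_grid, xy)
    logits = mlp(air_params, np.concatenate([app, sem, xy], axis=1))
    return 1.0 / (1.0 + np.exp(-logits))  # sigmoid keeps colors in (0, 1)

coords = rng.uniform(0, 1, size=(5, 2))  # arbitrary continuous query coordinates
colors = air(coords)
print(colors.shape)
```

Because both functions take continuous coordinates, the same model can in principle be queried at any resolution, including at positions that fall inside a missing region of the input; in this sketch the outputs are meaningless until the networks are trained.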