Decomposing a target object from a complex background while reconstructing it is challenging. Most approaches acquire object-instance awareness from manual labels, but annotation is costly. Recent advances in 2D self-supervised learning have opened new prospects for object-aware representations, yet it remains unclear how to leverage such noisy 2D features for clean decomposition. In this paper, we propose a Decomposed Object Reconstruction (DORec) network based on neural implicit representations. Our key idea is to convert 2D self-supervised features into masks at two levels of granularity to supervise the decomposition: a binary mask that indicates the foreground regions and a K-cluster mask that indicates semantically similar regions. The two masks are complementary and together lead to robust decomposition. Experimental results on various datasets show the superiority of DORec in segmenting and reconstructing the foreground object.
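To make the two mask granularities concrete, the following is a minimal sketch and not the DORec implementation. It assumes a per-pixel feature map of shape (H, W, D) is already available from some 2D self-supervised backbone, uses plain k-means to obtain a K-cluster mask, and derives a binary foreground mask with a hypothetical heuristic (clusters that touch the image border are treated as background). The function names, the initialization scheme, and the border heuristic are all illustrative assumptions; the abstract does not specify how either mask is computed.

```python
import numpy as np


def kmeans_labels(features, k, n_iters=20):
    """Plain Lloyd's k-means on (N, D) features; returns (N,) integer labels.

    Assumption: deterministic farthest-point initialization, chosen here only
    to keep the sketch reproducible.
    """
    centers = [features[0]]
    for _ in range(1, k):
        # Distance of every point to its nearest already-chosen center.
        d2 = np.min([((features - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(features[d2.argmax()])
    centers = np.stack(centers).astype(float)

    for _ in range(n_iters):
        d2 = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = features[labels == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)

    # Final assignment against the updated centers.
    d2 = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    return d2.argmin(axis=1)


def masks_from_features(feat_map, k):
    """Return (binary_mask, cluster_mask) for an (H, W, D) feature map."""
    h, w, d = feat_map.shape
    flat = feat_map.reshape(-1, d)
    # L2-normalize so clustering depends on feature direction, not magnitude.
    flat = flat / (np.linalg.norm(flat, axis=1, keepdims=True) + 1e-8)
    cluster_mask = kmeans_labels(flat, k).reshape(h, w)

    # Hypothetical heuristic: a cluster with no border pixels is foreground.
    border = np.zeros((h, w), dtype=bool)
    border[0, :] = border[-1, :] = True
    border[:, 0] = border[:, -1] = True
    binary_mask = np.zeros((h, w), dtype=bool)
    for j in range(k):
        region = cluster_mask == j
        if region.any() and not (region & border).any():
            binary_mask |= region
    return binary_mask, cluster_mask


# Toy input: an 8x8 map whose central 4x4 block has a distinct feature.
feat = np.zeros((8, 8, 2))
feat[..., 0] = 1.0
feat[2:6, 2:6] = [0.0, 1.0]
binary_mask, cluster_mask = masks_from_features(feat, k=2)
```

On this toy input the binary mask covers the central block, while the cluster mask assigns one label to the block and another to the surrounding pixels. With real, noisy features the two masks would generally disagree in places, which is the situation the complementary supervision is meant to address.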