Decomposing a target object from a complex background while reconstructing it is challenging. Most approaches learn to perceive object instances from manual labels, but annotation is costly. Recent advances in 2D self-supervised learning offer new prospects for object-aware representation, yet it remains unclear how to leverage such noisy 2D features for clean decomposition. In this paper, we propose a Decomposed Object Reconstruction (DORec) network based on neural implicit representations. Our key idea is to convert 2D self-supervised features into masks at two levels of granularity to supervise the decomposition: a binary mask that indicates the foreground regions and a K-cluster mask that indicates semantically similar regions. The two masks are complementary and together lead to robust decomposition. Experimental results on various datasets show the superiority of DORec in segmenting and reconstructing the foreground object.
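To make the two-granularity idea concrete, the following is a minimal sketch of deriving a binary mask and a K-cluster mask from a grid of per-pixel features. It is an illustration under stated assumptions, not the DORec pipeline: the abstract does not specify how either mask is computed, so plain k-means and a border-majority foreground heuristic are stand-ins chosen here, and the names `sq_dist`, `kmeans_labels`, and `feature_masks` are hypothetical.

```python
from typing import List, Sequence, Tuple

Feature = Sequence[float]


def sq_dist(a: Feature, b: Feature) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_labels(points: List[Feature], k: int, iters: int = 20) -> List[int]:
    """Plain k-means with deterministic farthest-point initialisation.

    Assumption: k-means stands in for whatever clustering the paper uses.
    """
    centers = [list(points[0])]
    while len(centers) < k:
        # Next centre: the point farthest from all centres chosen so far.
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(far))

    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centre for every point.
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                  for p in points]
        # Update step: each centre becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def feature_masks(
    feats: List[List[Feature]], k: int
) -> Tuple[List[List[int]], List[List[int]]]:
    """Return (binary_mask, k_cluster_mask) for an H x W grid of features."""
    h, w = len(feats), len(feats[0])
    flat = [feats[r][c] for r in range(h) for c in range(w)]

    # Fine granularity: K clusters of semantically similar pixels.
    fine = kmeans_labels(flat, k)

    # Coarse granularity: two clusters, one of which is called foreground.
    coarse = kmeans_labels(flat, 2)

    # Assumed heuristic: the cluster covering most border pixels is background.
    border = [coarse[r * w + c] for r in range(h) for c in range(w)
              if r in (0, h - 1) or c in (0, w - 1)]
    background = 1 if sum(border) * 2 > len(border) else 0

    binary = [[int(coarse[r * w + c] != background) for c in range(w)]
              for r in range(h)]
    k_mask = [[fine[r * w + c] for c in range(w)] for r in range(h)]
    return binary, k_mask
```

In this sketch the binary mask gives a coarse foreground/background split, while the K-cluster mask groups pixels with similar features, so a part of the object that the coarse split misses can still be tied to the pixels that resemble it. How DORec actually combines the two signals as supervision is not described in the abstract and is not modeled here.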