This paper proposes a shape anchor guided learning strategy (AncLearn) for robust holistic indoor scene understanding. We observe that the search space that current methods construct for proposal feature grouping and instance point sampling often introduces substantial noise into instance detection and mesh reconstruction. Accordingly, we develop AncLearn to generate anchors that dynamically fit instance surfaces in order to (i) separate noise from target-related features, yielding reliable proposals at the detection stage, and (ii) reduce outliers in object point sampling, directly providing well-structured geometry priors for reconstruction without segmentation. We embed AncLearn into a reconstruction-from-detection learning system (AncRec) to generate high-quality semantic scene models in a purely instance-oriented manner. Experiments on the challenging ScanNetv2 dataset demonstrate that our shape anchor-based method consistently achieves state-of-the-art performance in 3D object detection, layout estimation, and shape reconstruction. The code will be available at https://github.com/Geo-Tell/AncRec.