The primal sketch is a fundamental representation in Marr's vision theory, which allows for parsimonious image-level processing from 2D to 2.5D perception. This paper takes a further step by computing a 3D primal sketch of wireframes from a set of images with known camera poses, taking the 2D wireframes in the multi-view images as the basis for computing 3D wireframes in a volumetric rendering formulation. In our method, we first propose NEural Attraction (NEAT) Fields, which parameterize 3D line segments with coordinate Multi-Layer Perceptrons (MLPs), enabling us to learn the 3D line segments from 2D observations without requiring any explicit feature correspondences across views. We then present a novel Global Junction Perceiving (GJP) module that perceives meaningful 3D junctions from the NEAT Fields of 3D line segments by optimizing a randomly initialized high-dimensional latent array and a lightweight decoding MLP. Benefiting from our explicit modeling of 3D junctions, we finally compute the primal sketch of 3D wireframes by attracting the queried 3D line segments to the 3D junctions, significantly simplifying the computation paradigm of 3D wireframe parsing. In experiments, we evaluate our approach on the DTU and BlendedMVS datasets and obtain promising performance. As far as we know, our method is the first approach to achieve high-fidelity 3D wireframe parsing without requiring explicit matching.
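The final step, attracting queried 3D line segments to 3D junctions, can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes the segments and junctions are already available as plain 3D coordinates, uses a simple nearest-neighbor rule, and introduces a hypothetical helper `snap_segments_to_junctions` with an assumed distance threshold `max_dist`.

```python
import math


def snap_segments_to_junctions(segments, junctions, max_dist=0.05):
    """Snap each segment endpoint to its nearest junction (hypothetical helper).

    segments:  list of ((x, y, z), (x, y, z)) endpoint pairs
    junctions: list of (x, y, z) points
    max_dist:  assumed threshold; a segment is dropped if either endpoint
               is farther than this from every junction
    Returns a sorted list of unique (i, j) junction-index pairs with i < j.
    """
    if not junctions:
        return []

    def nearest(point):
        # Index of, and distance to, the junction closest to `point`.
        best = min(range(len(junctions)),
                   key=lambda k: math.dist(point, junctions[k]))
        return best, math.dist(point, junctions[best])

    edges = set()
    for p, q in segments:
        i, dist_p = nearest(p)
        j, dist_q = nearest(q)
        # Discard segments that are far from all junctions or that
        # collapse onto a single junction.
        if dist_p > max_dist or dist_q > max_dist or i == j:
            continue
        # Store edges in canonical order so duplicates merge.
        edges.add((min(i, j), max(i, j)))
    return sorted(edges)


junctions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
segments = [
    ((0.01, 0.0, 0.0), (0.99, 0.01, 0.0)),  # close to junctions 0 and 1
    ((1.0, 0.02, 0.0), (1.0, 0.98, 0.0)),   # close to junctions 1 and 2
    ((0.0, 0.01, 0.0), (1.01, 0.0, 0.0)),   # duplicate of the first edge
    ((0.5, 0.5, 0.5), (1.0, 1.0, 0.0)),     # one endpoint far from any junction
]
wireframe_edges = snap_segments_to_junctions(segments, junctions)
print(wireframe_edges)  # [(0, 1), (1, 2)]
```

Representing the wireframe as index pairs over a shared junction list is one way to see why explicit junctions simplify the output: noisy or redundant segments collapse onto the same edge.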