We introduce Uni-Fusion, a universal continuous mapping framework for surfaces, surface properties (color, infrared, etc.) and more (latent features in CLIP embedding space, etc.). We propose the first Universal Implicit Encoding model that supports encoding of both geometry and various types of properties (RGB, infrared, features, etc.) without the need for any training. Based on this model, our framework divides the point cloud into regular grid voxels and produces a latent feature in each voxel to form a Latent Implicit Map (LIM) for geometry and arbitrary properties. Incremental reconstruction is then achieved by fusing the Local LIM of each new frame into the Global LIM. Encoded with the corresponding type of data, our Latent Implicit Map is capable of generating continuous surfaces, surface property fields, surface feature fields, and any other possible options. To demonstrate the capabilities of our model, we implement three applications: (1) incremental reconstruction of surfaces and color, (2) 2D-to-3D transfer of fabricated properties, and (3) open-vocabulary scene understanding by producing a text CLIP feature field on surfaces. We evaluate Uni-Fusion by comparison in each corresponding application; it proves highly flexible across applications while performing best or competitively. The project page of Uni-Fusion is available at https://jarrome.github.io/Uni-Fusion/
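The voxel-wise encoding and local-to-global fusion described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the Uni-Fusion implementation: the function names, the `toy_encode` placeholder (which simply averages coordinates and property values instead of applying the training-free implicit encoder), and the count-weighted averaging used for fusion are all hypothetical choices made for the example.

```python
import numpy as np

def toy_encode(local_xyz, values):
    # Hypothetical stand-in for the implicit encoder: one latent per voxel,
    # here just the mean local coordinate concatenated with the mean property.
    return np.concatenate([local_xyz.mean(axis=0), values.mean(axis=0)])

def build_local_lim(points, values, voxel_size, encode=toy_encode):
    """Split a frame's points into regular grid voxels and encode each voxel."""
    voxel_idx = np.floor(points / voxel_size).astype(np.int64)
    keys, inverse = np.unique(voxel_idx, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)  # guard against NumPy version differences
    lim = {}
    for k, key in enumerate(keys):
        mask = inverse == k
        # Express points in voxel-local coordinates, each component in [0, 1).
        local_xyz = points[mask] / voxel_size - key
        lim[tuple(key.tolist())] = (encode(local_xyz, values[mask]), int(mask.sum()))
    return lim

def fuse(global_lim, local_lim):
    """Merge a local map into the global map (assumed count-weighted average)."""
    for key, (latent, count) in local_lim.items():
        if key in global_lim:
            g_latent, g_count = global_lim[key]
            total = g_count + count
            global_lim[key] = ((g_count * g_latent + count * latent) / total, total)
        else:
            global_lim[key] = (latent, count)
    return global_lim

# Two small frames with a one-channel property (e.g. intensity).
frame_a = (np.array([[0.1, 0.1, 0.1], [0.3, 0.3, 0.3]]), np.array([[1.0], [3.0]]))
frame_b = (np.array([[0.5, 0.5, 0.5], [1.5, 0.5, 0.5]]), np.array([[5.0], [7.0]]))

global_lim = {}
for pts, vals in (frame_a, frame_b):
    global_lim = fuse(global_lim, build_local_lim(pts, vals, voxel_size=1.0))
```

Keying the map by integer voxel index keeps it sparse, so only voxels that actually contain observations store a latent, and each new frame touches only the voxels it overlaps.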