3D geometric shape completion hinges on representation learning and a deep understanding of geometric data: without insight into the three-dimensional structure of the data, the task cannot be solved reliably. Our work addresses 3D shape completion from partial observations by proposing a transformer that operates on a latent-space representation of Signed Distance Fields (SDFs). Instead of treating the SDF as a monolithic volume, we partition it into smaller high-resolution patches, yielding a sequence of latent codes. The approach relies on a smooth latent encoding learned by a variational autoencoder (VAE) trained on millions of 3D patches. An efficient masked-autoencoder transformer then completes partial sequences into full shapes in latent space. We evaluate extensively on partial observations from ShapeNet and the ABC dataset, where only fractions of each object are given. The proposed POC-SLT architecture compares favorably with several state-of-the-art baselines, demonstrating a significant improvement in 3D shape completion both qualitatively and quantitatively.
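The patch-wise sequence construction described above can be illustrated with a minimal sketch. This is not the paper's code: the function name, the patch size of 8, and the 32³ sphere SDF are assumptions chosen for illustration; it only shows how a dense SDF volume becomes the token sequence that a patch-wise VAE and transformer would consume.

```python
import numpy as np

def sdf_to_patch_sequence(sdf, patch=8):
    """Split a cubic SDF grid into a sequence of non-overlapping patch^3 blocks.

    Illustrative only: patch size and layout are assumptions, not the paper's.
    """
    r = sdf.shape[0]
    assert sdf.ndim == 3 and sdf.shape == (r, r, r) and r % patch == 0
    n = r // patch  # patches per axis
    # Reshape to (n, patch, n, patch, n, patch), then move the three
    # within-patch axes to the end so each patch is contiguous.
    blocks = sdf.reshape(n, patch, n, patch, n, patch)
    blocks = blocks.transpose(0, 2, 4, 1, 3, 5)
    return blocks.reshape(n ** 3, patch ** 3)  # (sequence length, patch dim)

# Example: the analytic SDF of a sphere of radius 0.5 on a 32^3 grid
# becomes a sequence of 4^3 = 64 patches, each with 8^3 = 512 values.
coords = np.linspace(-1.0, 1.0, 32)
x, y, z = np.meshgrid(coords, coords, coords, indexing="ij")
sdf = np.sqrt(x ** 2 + y ** 2 + z ** 2) - 0.5
seq = sdf_to_patch_sequence(sdf, patch=8)
print(seq.shape)  # (64, 512)
```

In the paper's pipeline, each such patch would be encoded by the VAE into a latent code, and the masked transformer would complete the resulting sequence; here only the partitioning step is sketched.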