Recent advances in Gaussian Splatting have driven significant progress in the field, enabling both panoptic and interactive segmentation of 3D scenes. However, existing methods often overlook the need to reconstruct designated targets with complex structures from sparse views. To address this, we introduce TSGaussian, a novel framework that combines semantic constraints with depth priors to avoid geometry degradation in challenging novel view synthesis tasks. Our approach concentrates computational resources on designated targets while minimizing allocation to the background. Bounding boxes from YOLOv9 serve as prompts for the Segment Anything Model to generate 2D mask predictions, ensuring semantic accuracy and cost efficiency. TSGaussian effectively clusters 3D Gaussians by introducing a compact identity encoding for each Gaussian ellipsoid and incorporating a 3D spatial consistency regularization. Building on these modules, we propose a pruning strategy that effectively reduces redundancy among the 3D Gaussians. Extensive experiments demonstrate that TSGaussian outperforms state-of-the-art methods on three standard datasets and on a new, challenging dataset we collected, achieving superior results in novel view synthesis of specific objects. Code is available at: https://github.com/leon2000-ai/TSGaussian.
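The clustering-and-pruning idea above can be sketched in a few lines: each Gaussian carries a compact identity encoding, a classifier maps encodings to per-object assignments, and Gaussians not confidently assigned to the designated target are pruned. This is a minimal, self-contained NumPy illustration under assumed shapes and a randomly initialized linear classifier; the names (`identity_codes`, `classifier`, `conf_thresh`) and the thresholding rule are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
num_gaussians, code_dim, num_ids = 1000, 16, 4

# Hypothetical learned identity encoding: one compact vector per Gaussian ellipsoid.
identity_codes = rng.normal(size=(num_gaussians, code_dim))

# Hypothetical linear classifier mapping each encoding to per-object-ID logits.
classifier = rng.normal(size=(code_dim, num_ids))
logits = identity_codes @ classifier

# Softmax over object IDs gives each Gaussian a soft cluster assignment.
probs = np.exp(logits - logits.max(axis=1, keepdims=True))
probs /= probs.sum(axis=1, keepdims=True)
assignments = probs.argmax(axis=1)

# Pruning sketch: keep only Gaussians confidently assigned to the designated
# target ID; everything else (background, other objects) is discarded.
target_id, conf_thresh = 2, 0.5
keep = (assignments == target_id) & (probs[:, target_id] > conf_thresh)
print(f"kept {keep.sum()} of {num_gaussians} Gaussians")
```

In the actual framework the assignments would be supervised by the SAM-generated 2D masks and regularized for 3D spatial consistency; here the classifier is random purely to make the control flow concrete.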