Most NeRF-based models are designed to learn the entire scene, and complex scenes can lead to longer training times and poorer rendering quality. This paper uses scene semantic priors to speed up training, allowing the network to focus on specific targets without being affected by complex backgrounds. Training speed can be increased by 7.78 times with better rendering quality, and small to medium-sized targets can be rendered faster. In addition, this improvement applies to all NeRF-based models. Considering the inherent multi-view consistency and smoothness of NeRF, this paper also studies weak supervision by sparsely sampling negative rays. This further accelerates training while maintaining rendering quality. Finally, this paper extends the pixel semantic and color rendering formulas and proposes a new scene editing technique that can either display specific semantic targets on their own or mask them out during rendering. To address incorrect inferences in unsupervised regions of the scene, we also design a self-supervised loop that combines morphological operations and clustering.
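The sparse negative-ray sampling mentioned above can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes that a "negative" ray is one passing through a pixel outside the target's semantic mask, that every target ("positive") pixel is kept, and that negatives are thinned uniformly at random. The function name `sample_training_rays` and the parameters `negative_fraction` and `seed` are hypothetical, introduced only for illustration.

```python
import random


def sample_training_rays(mask, negative_fraction=0.1, seed=None):
    """Split pixels into target rays and a sparse subset of background rays.

    mask: 2D list of booleans, True where the pixel belongs to the target
          (assumed to come from a semantic prior; not specified by the paper).
    negative_fraction: share of background pixels to keep (hypothetical knob).
    Returns (positive, kept_negative) as lists of flat pixel indices.
    """
    rng = random.Random(seed)
    width = len(mask[0])
    positive, negative = [], []
    for row_idx, row in enumerate(mask):
        for col_idx, is_target in enumerate(row):
            flat_index = row_idx * width + col_idx
            (positive if is_target else negative).append(flat_index)

    # Keep all target rays; keep only a random fraction of background rays.
    n_keep = round(negative_fraction * len(negative))
    kept_negative = rng.sample(negative, n_keep)
    return positive, kept_negative


# Example: a 4x4 image whose central 2x2 block is the target.
example_mask = [[1 <= r <= 2 and 1 <= c <= 2 for c in range(4)] for r in range(4)]
pos, neg = sample_training_rays(example_mask, negative_fraction=0.25, seed=0)
```

In this example all four target pixels are retained and three of the twelve background pixels are drawn as weak supervision. Which fraction works in practice, and whether uniform sampling is the right choice, are questions the abstract does not settle.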