Accurate semantic segmentation of terrestrial laser scanning (TLS) point clouds is limited by costly manual annotation. We propose a semi-automated, uncertainty-aware pipeline that integrates spherical projection, feature enrichment, ensemble learning, and targeted annotation to reduce labeling effort while sustaining high accuracy. Our approach projects 3D points onto a 2D spherical grid, enriches pixels with multi-source features, and trains an ensemble of segmentation networks to produce pseudo-labels and uncertainty maps; the uncertainty maps guide annotation of ambiguous regions. The 2D outputs are back-projected to 3D, yielding densely annotated point clouds supported by a three-tier visualization suite (2D feature maps, 3D colorized point clouds, and compact virtual spheres) for rapid triage and reviewer guidance. Using this pipeline, we build Mangrove3D, a TLS semantic segmentation dataset for mangrove forests. We further evaluate data efficiency and feature importance to address two key questions: (1) how much annotated data is needed and (2) which features matter most. Results show that performance saturates after about 12 annotated scans, that geometric features contribute the most, and that compact nine-channel stacks capture nearly all discriminative power, with the mean Intersection over Union (mIoU) plateauing at around 0.76. Finally, we confirm that our feature-enrichment strategy generalizes through cross-dataset tests on ForestSemantic and Semantic3D. Our contributions include: (i) a robust, uncertainty-aware TLS annotation pipeline with visualization tools; (ii) the Mangrove3D dataset; and (iii) empirical guidance on data efficiency and feature importance, enabling scalable, high-quality segmentation of TLS point clouds for ecological monitoring and beyond. The dataset and processing scripts are publicly available at https://fz-rit.github.io/through-the-lidars-eye/.
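To make the projection and back-projection steps concrete, the following is a minimal NumPy sketch, not the released implementation. It assumes the scanner sits at the origin, uses an equirectangular azimuth/elevation grid, and keeps the nearest return per pixel; the function names `spherical_project` and `back_project` and the default grid size are hypothetical choices for illustration.

```python
import numpy as np

def spherical_project(points, height=64, width=128):
    """Map (N, 3) scanner-centred points to a (height, width) range image.

    Returns the range image plus each point's (row, col) pixel index,
    which is what later allows 2D predictions to be carried back to 3D.
    Assumption: equirectangular grid, nearest return wins per pixel.
    """
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    rng = np.linalg.norm(points, axis=1)
    azimuth = np.arctan2(y, x)                          # in [-pi, pi]
    elevation = np.arcsin(z / np.maximum(rng, 1e-9))    # in [-pi/2, pi/2]

    # Normalise both angles to [0, 1] and scale to pixel indices.
    col = np.floor((azimuth + np.pi) / (2 * np.pi) * width).astype(int)
    row = np.floor((np.pi / 2 - elevation) / np.pi * height).astype(int)
    col = np.clip(col, 0, width - 1)
    row = np.clip(row, 0, height - 1)

    # Several points can share a pixel; keep the smallest range.
    image = np.full((height, width), np.inf)
    np.minimum.at(image, (row, col), rng)
    image[np.isinf(image)] = np.nan                     # mark empty pixels
    return image, row, col

def back_project(label_image, row, col):
    """Give every 3D point the label of the pixel it was projected into."""
    return label_image[row, col]
```

Storing `row` and `col` per point is the design choice that keeps back-projection a single indexing operation: every point inherits the label of its pixel, including points that were occluded in the range image.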
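The pseudo-label and uncertainty-map step can be sketched as follows. The abstract does not say which uncertainty measure is used, so this example assumes the normalised predictive entropy of the ensemble-averaged softmax output, and flags a fixed fraction of the most uncertain pixels for manual review; `ensemble_pseudo_labels`, `pixels_to_review`, and the `fraction` parameter are hypothetical names introduced only for illustration.

```python
import numpy as np

def ensemble_pseudo_labels(member_probs):
    """Fuse ensemble predictions into pseudo-labels and an uncertainty map.

    member_probs: (M, C, H, W) array of per-class softmax probabilities
    from M ensemble members. Assumption: uncertainty is the predictive
    entropy of the mean prediction, scaled to [0, 1] by log(C).
    """
    mean_probs = member_probs.mean(axis=0)               # (C, H, W)
    pseudo_labels = mean_probs.argmax(axis=0)            # (H, W)
    eps = 1e-12                                          # avoids log(0)
    entropy = -(mean_probs * np.log(mean_probs + eps)).sum(axis=0)
    uncertainty = entropy / np.log(mean_probs.shape[0])
    return pseudo_labels, uncertainty

def pixels_to_review(uncertainty, fraction=0.1):
    """Boolean mask selecting roughly the most uncertain `fraction` of pixels."""
    threshold = np.quantile(uncertainty, 1.0 - fraction)
    return uncertainty >= threshold
```

Pixels where the members agree receive near-zero entropy and keep their pseudo-label, while pixels where the members disagree rise to the top of the review queue, which is the sense in which annotation effort is targeted.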