The growing adoption of robotics and augmented reality in real-world applications has driven considerable research interest in point-cloud-based 3D object detection. Although previous methods address unified training across multiple datasets, they fail to model geometric relationships in sparse point cloud scenes and ignore the feature distribution in significant regions, which ultimately limits their performance. To address these issues, we propose UniGeo, a unified 3D indoor detection framework. To model geometric relations in scenes, we first propose a geometry-aware learning module that establishes a learnable mapping from spatial relationships to feature weights, which enables explicit geometric feature enhancement. Then, to further enhance point cloud feature representation, we propose a dynamic channel gating mechanism that leverages learnable channel-wise weighting. This mechanism adaptively optimizes the features generated by the sparse 3D U-Net, significantly enhancing key geometric information. Extensive experiments on six indoor scene datasets validate the superior performance of our method.
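To make the two ideas in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes dense arrays instead of a sparse 3D U-Net, random weights in place of learned ones, a two-layer mapping from neighbor offsets to per-channel weights, and a sigmoid gate computed from the scene-average feature. All function names, parameters (`w1`, `w2`, `wg`, `bg`, `k`), and shapes are hypothetical choices made for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def geometry_aware_weights(coords, w1, w2, k=4):
    """Map relative neighbor offsets to per-channel feature weights.

    Hypothetical stand-in for a learnable spatial-relation-to-weight mapping.
    """
    # offsets[i, j] = coords[j] - coords[i], shape (N, N, 3)
    offsets = coords[None, :, :] - coords[:, None, :]
    dists = np.linalg.norm(offsets, axis=-1)
    # k nearest neighbors per point; column 0 is the point itself
    # (assumes no duplicate coordinates), so it is skipped.
    idx = np.argsort(dists, axis=1)[:, 1:k + 1]                  # (N, k)
    rel = np.take_along_axis(offsets, idx[:, :, None], axis=1)   # (N, k, 3)
    hidden = np.maximum(rel @ w1, 0.0)                           # ReLU, (N, k, H)
    weights = sigmoid(hidden @ w2)                               # (N, k, C)
    return idx, weights


def geometry_enhance(feats, coords, w1, w2, k=4):
    """Add geometry-weighted neighbor features to each point's own feature."""
    idx, weights = geometry_aware_weights(coords, w1, w2, k)
    neigh = feats[idx]                                           # (N, k, C)
    return feats + (weights * neigh).mean(axis=1)


def channel_gate(feats, wg, bg):
    """Rescale channels with a gate computed from the scene-average feature."""
    context = feats.mean(axis=0)                                 # (C,)
    gate = sigmoid(context @ wg + bg)                            # (C,)
    return feats * gate, gate


# Toy scene: 32 points with 8-channel features (illustrative sizes only).
rng = np.random.default_rng(0)
N, C, H = 32, 8, 16
coords = rng.uniform(0.0, 5.0, size=(N, 3))
feats = rng.normal(size=(N, C))
w1 = rng.normal(scale=0.1, size=(3, H))
w2 = rng.normal(scale=0.1, size=(H, C))
wg = rng.normal(scale=0.1, size=(C, C))
bg = np.zeros(C)

enhanced = geometry_enhance(feats, coords, w1, w2)
gated, gate = channel_gate(enhanced, wg, bg)
```

In this sketch the gate is shared across all points in the scene. A per-point gate would be an equally plausible reading of "dynamic channel gating", and the abstract does not say which variant UniGeo uses.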