3D single object tracking is essential in autonomous driving and robotics. Existing methods often struggle when point clouds are sparse or incomplete. To address this limitation, we propose a Multimodal-guided Virtual Cues Projection (MVCP) scheme that generates virtual cues to enrich sparse point clouds. We also introduce MVCTrack, an enhanced tracker built on the generated virtual cues. Specifically, the MVCP scheme integrates RGB sensors into LiDAR-based systems, leveraging a set of 2D detections to create dense 3D virtual cues that significantly alleviate the sparsity of point clouds. These virtual cues integrate naturally with existing LiDAR-based 3D trackers, yielding substantial performance gains. Extensive experiments demonstrate that our method achieves competitive performance on the NuScenes dataset.
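
The abstract does not spell out how 2D detections are lifted into 3D, so the following is only a minimal sketch of one plausible way to do it, not the MVCP implementation itself. It assumes LiDAR points already expressed in the camera frame, a pinhole intrinsic matrix `K`, and a single axis-aligned 2D detection box; pixels are sampled inside the box, each takes the depth of the nearest projected LiDAR point, and the result is unprojected into 3D. The function name `virtual_cues_from_box` and all of its parameters are hypothetical.

```python
import numpy as np


def virtual_cues_from_box(points_cam, K, box, n_samples=50, seed=0):
    """Hypothetical helper: lift a 2D detection box to 3D virtual points.

    points_cam: (N, 3) LiDAR points in the camera frame (assumed input).
    K:          (3, 3) pinhole intrinsics with last row [0, 0, 1].
    box:        (x1, y1, x2, y2) detection box in pixels.
    """
    rng = np.random.default_rng(seed)

    # Keep only points in front of the camera, then project to pixels.
    pts = points_cam[points_cam[:, 2] > 0]
    if len(pts) == 0:
        return np.empty((0, 3))
    uvw = pts @ K.T
    uv = uvw[:, :2] / uvw[:, 2:3]

    # Select the real LiDAR points whose projection falls inside the box.
    x1, y1, x2, y2 = box
    inside = (
        (uv[:, 0] >= x1) & (uv[:, 0] <= x2)
        & (uv[:, 1] >= y1) & (uv[:, 1] <= y2)
    )
    if not inside.any():
        return np.empty((0, 3))
    uv_in, depth_in = uv[inside], pts[inside, 2]

    # Sample pixel locations uniformly inside the box.
    samples = rng.uniform([x1, y1], [x2, y2], size=(n_samples, 2))

    # Assumption: each sample borrows the depth of its nearest real point.
    dist2 = ((samples[:, None, :] - uv_in[None, :, :]) ** 2).sum(axis=-1)
    depth = depth_in[dist2.argmin(axis=1)]

    # Unproject: K^-1 [u, v, 1]^T is a ray with z = 1; scale it by depth.
    pix_h = np.column_stack([samples, np.ones(n_samples)])
    rays = pix_h @ np.linalg.inv(K).T
    return rays * depth[:, None]


# Toy usage: two LiDAR points on an object 10 m away, one 2D box around it.
K = np.array([[100.0, 0.0, 50.0], [0.0, 100.0, 50.0], [0.0, 0.0, 1.0]])
lidar = np.array([[0.0, 0.0, 10.0], [0.1, 0.0, 10.0]])
cues = virtual_cues_from_box(lidar, K, box=(40, 40, 60, 60))
dense = np.vstack([lidar, cues])  # enriched cloud for a LiDAR-based tracker
```

In this toy case, two real points become a cloud of 52 points that a LiDAR-only tracker could consume unchanged, which is the sense in which virtual cues can plug into existing trackers. Nearest-neighbor depth assignment is a simplification chosen for brevity; it ignores depth discontinuities at object boundaries.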