3D single object tracking (SOT) is an important and challenging task for autonomous driving and mobile robotics. Most existing methods track between two consecutive frames and ignore the motion patterns of the target over a series of frames, which degrades performance in scenes with sparse points. To overcome this limitation, we introduce a Sequence-to-Sequence tracking paradigm and a tracker named SeqTrack3D that captures target motion across continuous frames. Previous methods mainly follow one of three strategies: matching two consecutive point clouds, predicting relative motion, or using sequential point clouds to address feature degradation. In contrast, SeqTrack3D combines historical point clouds with bounding box sequences. By leveraging location priors from historical boxes, it tracks robustly even in scenes with sparse points. Extensive experiments on large-scale datasets show that SeqTrack3D achieves new state-of-the-art performance, improving by 6.00% on NuScenes and 14.13% on the Waymo dataset. The code will be made public at https://github.com/aron-lin/seqtrack3d.
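To make the idea of combining historical point clouds with a bounding box sequence more concrete, the following is a minimal sketch and not the SeqTrack3D network itself. It assumes boxes are stored as `[x, y, z, w, l, h, yaw]` and that a constant-velocity extrapolation is an acceptable stand-in for a learned location prior. The function names `stack_with_time` and `constant_velocity_prior` are hypothetical and do not come from the paper or its repository.

```python
import numpy as np


def stack_with_time(point_clouds):
    """Concatenate per-frame (N_i, 3) clouds into one (sum N_i, 4) array.

    The 4th column is a frame index (0 = oldest), so a downstream model
    can tell which frame each point came from. (Illustrative assumption,
    not the paper's exact input encoding.)
    """
    tagged = [
        np.hstack([pc, np.full((len(pc), 1), t, dtype=pc.dtype)])
        for t, pc in enumerate(point_clouds)
    ]
    return np.vstack(tagged)


def constant_velocity_prior(boxes):
    """Extrapolate the next box from a (T, 7) history of
    [x, y, z, w, l, h, yaw] boxes, assuming constant velocity.

    Only the center is moved; size and yaw are copied from the last box.
    """
    boxes = np.asarray(boxes, dtype=float)
    if len(boxes) < 2:
        return boxes[-1].copy()
    velocity = boxes[-1, :3] - boxes[-2, :3]
    prior = boxes[-1].copy()
    prior[:3] += velocity
    return prior


# Two historical frames: point clouds plus the boxes estimated for them.
clouds = [np.zeros((2, 3)), np.ones((1, 3))]
boxes = [[0, 0, 0, 2, 4, 1.5, 0.0], [1, 0.5, 0, 2, 4, 1.5, 0.1]]

seq_points = stack_with_time(clouds)    # time-tagged points, shape (3, 4)
prior = constant_velocity_prior(boxes)  # rough guess for the next box
```

Even when the current frame contains very few points, `prior` still gives a plausible location for the target, which is the intuition behind using box history as a location prior. A learned model would refine such a guess using the time-tagged points rather than rely on it directly.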