Reliable perception of spatial and motion information is crucial for safe autonomous navigation. Traditional approaches typically fall into two categories: object-centric and class-agnostic methods. Object-centric methods often struggle with missed detections, leading to inaccuracies in motion prediction, while many class-agnostic methods focus heavily on encoder design and overlook important priors such as rigidity and temporal consistency, yielding suboptimal performance, particularly on sparse LiDAR data in distant regions. To address these issues, we propose $\textbf{PriorMotion}$, a generative framework that extracts rasterized and vectorized scene representations to model spatio-temporal priors. Our model comprises a BEV encoder, a Raster-Vector prior Encoder, and a Spatio-Temporal prior Generator, improving both spatial and temporal consistency in motion prediction. Additionally, we introduce a standardized evaluation protocol for class-agnostic motion prediction. Experiments on the nuScenes dataset show that PriorMotion achieves state-of-the-art performance, and further validation on advanced FMCW LiDAR confirms its robustness.
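To make the three-stage pipeline named above concrete, here is a minimal, hypothetical sketch of how a BEV encoder, a raster-vector prior encoder, and a spatio-temporal prior generator could compose into a class-agnostic motion predictor. All function names, shapes, and the toy computations are assumptions for illustration, not the authors' actual implementation; real versions would be learned neural modules.

```python
import numpy as np

def bev_encode(points, grid=32):
    # Toy stand-in for a learned BEV encoder: scatter LiDAR points
    # (normalized to [-1, 1] in x, y) into an occupancy grid.
    feat = np.zeros((grid, grid))
    idx = np.clip(((points[:, :2] + 1.0) / 2.0 * grid).astype(int), 0, grid - 1)
    feat[idx[:, 0], idx[:, 1]] = 1.0
    return feat

def raster_vector_prior(bev):
    # Hypothetical prior encoder: the rasterized representation keeps the
    # dense BEV grid; the vectorized representation is a compact global
    # descriptor (here, a simple column-wise mean).
    raster = bev
    vector = bev.mean(axis=0)
    return raster, vector

def st_prior_generate(raster, vector):
    # Hypothetical generator: produce a 2-channel (dx, dy) motion field
    # conditioned on both representations. This trivial broadcast only
    # illustrates the output shape of a dense, class-agnostic motion field.
    h, w = raster.shape
    return np.stack([raster * vector.mean(), np.zeros((h, w))])

points = np.random.default_rng(0).uniform(-1, 1, (100, 3))
bev = bev_encode(points)
raster, vector = raster_vector_prior(bev)
motion = st_prior_generate(raster, vector)
print(motion.shape)  # (2, 32, 32): per-cell BEV motion vectors
```

The point of the sketch is the data flow: point cloud → BEV features → paired rasterized/vectorized priors → dense per-cell motion field, which is the output format class-agnostic motion prediction is evaluated on.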