Predicting the trajectory of the ego vehicle is a critical component of autonomous driving systems. Current state-of-the-art methods typically rely on Deep Neural Networks (DNNs) and sequential models that process front-view images to predict the future trajectory. However, these approaches often struggle with perspective effects that distort the features of objects in the scene. To address this, we advocate the use of a Bird's Eye View (BEV) perspective, which offers unique advantages in capturing spatial relationships and object homogeneity. In our work, we use Graph Neural Networks (GNNs) and positional encoding to represent objects in BEV, achieving performance competitive with traditional DNN-based methods. Although the BEV-based approach loses some of the detail inherent in front-view images, we compensate by representing the BEV data as a graph that captures the relationships between the objects in a scene.
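The idea of representing BEV objects as a graph with positional encoding can be illustrated with a minimal sketch. The abstract does not specify the encoding, the edge rule, or the GNN layer, so everything here is an assumption made for illustration: a sinusoidal encoding of each object's (x, y) position in the ego frame, edges between objects closer than a hypothetical `radius`, and a single mean-aggregation message-passing step. The function names (`sinusoidal_encoding`, `radius_graph`, `mean_aggregate`) and the toy scene are likewise hypothetical and are not taken from the paper.

```python
import numpy as np


def sinusoidal_encoding(coords, num_freqs=4):
    """Encode each coordinate with sin/cos at geometrically spaced frequencies.

    Assumed encoding; the abstract only says "positional encoding".
    """
    coords = np.asarray(coords, dtype=float)              # (N, 2)
    freqs = 2.0 ** np.arange(num_freqs)                   # (F,)
    angles = coords[:, :, None] * freqs[None, None, :]    # (N, 2, F)
    enc = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)  # (N, 2, 2F)
    return enc.reshape(coords.shape[0], -1)               # (N, 4F)


def radius_graph(coords, radius):
    """Symmetric adjacency matrix linking objects closer than `radius` (no self-loops).

    Assumed edge rule; other choices (k-nearest neighbours, full graph) are possible.
    """
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]        # (N, N, 2)
    dist = np.linalg.norm(diff, axis=-1)                  # (N, N)
    adj = (dist < radius).astype(float)
    np.fill_diagonal(adj, 0.0)
    return adj


def mean_aggregate(features, adj):
    """One message-passing step: append the mean of each node's neighbour features."""
    deg = adj.sum(axis=1, keepdims=True)                  # (N, 1)
    neigh = adj @ features / np.maximum(deg, 1.0)         # isolated nodes get zeros
    return np.concatenate([features, neigh], axis=1)      # (N, 2 * D)


# Toy BEV scene in the ego frame (metres): ego at the origin, two nearby
# objects, and one distant object that ends up with no edges.
positions = np.array([[0.0, 0.0], [3.0, 1.0], [4.0, -1.0], [30.0, 12.0]])

node_feats = sinusoidal_encoding(positions, num_freqs=4)  # (4, 16)
adj = radius_graph(positions, radius=10.0)                # (4, 4)
updated = mean_aggregate(node_feats, adj)                 # (4, 32)
```

In a full model the updated node features would typically pass through learned layers and a sequential decoder to produce the trajectory; that part is omitted here because the abstract gives no details about it.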