High-speed autonomous racing presents extreme perception challenges, including large relative velocities and substantial domain shifts from conventional urban-driving datasets. Existing benchmarks do not adequately capture these high-dynamic conditions. We introduce EagleVision, a unified LiDAR-based multi-task benchmark for 3D detection and trajectory prediction in high-speed racing, providing newly annotated 3D bounding boxes for the Indy Autonomous Challenge dataset (14,893 frames) and the A2RL Real competition dataset (1,163 frames), together with 12,000 simulator-generated annotated frames, all standardized under a common evaluation protocol. Using a dataset-centric transfer framework, we quantify cross-domain generalization across urban, simulator, and real racing domains. Urban pretraining improves detection over scratch training (NDS 0.72 vs. 0.69), while intermediate pretraining on real racing data achieves the best transfer to A2RL (NDS 0.726), outperforming simulator-only adaptation. For trajectory prediction, Indy-trained models surpass in-domain A2RL training on A2RL test sequences (FDE 0.947 vs. 1.250), highlighting the role of motion-distribution coverage in cross-domain forecasting. EagleVision enables systematic study of perception generalization under extreme high-speed dynamics. The dataset and benchmark are publicly available at https://avlab.io/EagleVision