Can the Waymo Open Motion Dataset Support Realistic Behavioral Modeling? A Validation Study with Naturalistic Trajectories

The Waymo Open Motion Dataset (WOMD) has become a popular resource for data-driven modeling of autonomous vehicle (AV) behavior. However, its validity for behavioral analysis remains uncertain because of proprietary post-processing, the absence of error quantification, and the segmentation of trajectories into 20-second clips. This study examines whether WOMD accurately captures the dynamics and interactions observed in real-world AV operations. Using an independently collected naturalistic dataset from Level 4 AV operations in Phoenix, Arizona (PHX), we perform comparative analyses across three representative urban driving scenarios: discharging at signalized intersections, car-following, and lane-changing. For the discharging analysis, headways are manually extracted from aerial video to ensure negligible measurement error. For the car-following and lane-changing cases, we apply the Simulation-Extrapolation (SIMEX) method to account for empirically estimated error in the PHX data and use Dynamic Time Warping (DTW) distances to quantify behavioral differences. Results across all scenarios consistently show that behavior in PHX falls outside the behavioral envelope of WOMD. Notably, WOMD underrepresents short headways and abrupt decelerations. These findings suggest that behavioral models calibrated solely on WOMD may systematically underestimate the variability, risk, and complexity of naturalistic driving. Caution is therefore warranted when using WOMD for behavior modeling without proper validation against independently collected data.
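As a rough illustration of the DTW distance mentioned in the abstract, the sketch below implements the textbook dynamic-programming form of DTW for two one-dimensional sequences. The abstract does not say which DTW variant, local cost, or window constraint the study uses, so the absolute-difference cost, the unconstrained warping path, the function name `dtw_distance`, and the speed values are all assumptions made for illustration only.

```python
import math


def dtw_distance(a, b):
    """Textbook DTW distance between two 1-D sequences.

    Assumptions (not taken from the study): absolute-difference local
    cost, no warping-window constraint, no path-length normalization.
    """
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")

    # prev[j] holds the cumulative cost of aligning a[:i-1] with b[:j].
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0
    for i in range(1, n + 1):
        curr = [math.inf] * (m + 1)
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor cells.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m]


# Hypothetical speed profiles (m/s): the same deceleration, shifted in time.
speeds_a = [10.0, 10.0, 8.0, 5.0, 5.0]
speeds_b = [10.0, 8.0, 5.0, 5.0, 5.0]
print(dtw_distance(speeds_a, speeds_b))
```

Because DTW allows one sample to be matched to several samples in the other sequence, two profiles that differ only in timing score as close, while profiles with different magnitudes (for example, a harder deceleration) accumulate cost. That property is what makes it a plausible choice for comparing trajectories of unequal length or alignment.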

