The Waymo Open Motion Dataset (WOMD) has become a popular resource for data-driven modeling of autonomous vehicle (AV) behavior. However, its validity for behavioral analysis remains uncertain because of proprietary post-processing, the absence of error quantification, and the segmentation of trajectories into 20-second clips. This study examines whether WOMD accurately captures the dynamics and interactions observed in real-world AV operations. Using an independently collected naturalistic dataset from Level 4 AV operations in Phoenix, Arizona (PHX), we perform comparative analyses across three representative urban driving scenarios: queue discharge at signalized intersections, car-following, and lane-changing. For the discharge analysis, headways are manually extracted from aerial video so that measurement error is negligible. For the car-following and lane-changing cases, we apply the Simulation-Extrapolation (SIMEX) method to account for empirically estimated measurement error in the PHX data and use Dynamic Time Warping (DTW) distances to quantify behavioral differences. Results across all three scenarios consistently show that behavior in PHX falls outside the behavioral envelope of WOMD. Notably, WOMD underrepresents short headways and abrupt decelerations. These findings suggest that behavioral models calibrated solely on WOMD may systematically underestimate the variability, risk, and complexity of naturalistic driving. Caution is therefore warranted when using WOMD for behavior modeling without validation against independently collected data.
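To make the DTW comparison mentioned above concrete, the following is a minimal sketch of the textbook DTW distance between two one-dimensional series, such as two speed profiles. It is an illustration only: the abstract does not state which DTW variant, local cost, or normalization the study uses, so the absolute-difference cost and the function name `dtw_distance` are assumptions made for this example.

```python
import numpy as np


def dtw_distance(a, b):
    """Textbook DTW distance between two 1-D sequences.

    Assumption for illustration: the local cost is the absolute
    difference, with no window constraint and no length normalization.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)

    # cost[i, j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments
            cost[i, j] = local + min(
                cost[i - 1, j],      # a advances, b repeats its sample
                cost[i, j - 1],      # b advances, a repeats its sample
                cost[i - 1, j - 1],  # both advance
            )

    return float(cost[n, m])
```

Because DTW allows samples to be repeated during alignment, two profiles with the same shape but different timing (for example `[0, 1, 2]` and `[0, 0, 1, 2]`) have distance zero, whereas profiles with a constant offset accumulate cost at every aligned step. This insensitivity to timing shifts is why DTW suits comparing trajectories of unequal length or pace.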