We consider the problem of traffic accident analysis on a road network based on road network connections and traffic volume. Previous works have designed various deep-learning methods using historical records to predict traffic accident occurrences. However, there is a lack of consensus on how accurate existing methods are, and a fundamental issue is the lack of public accident datasets for comprehensive evaluations. This paper constructs a large-scale, unified dataset of traffic accident records from official reports of various states in the US, totaling 9 million records, accompanied by road networks and traffic volume reports. Using this new dataset, we evaluate existing deep-learning methods for predicting the occurrence of accidents on road networks. Our main finding is that graph neural networks such as GraphSAGE can accurately predict the number of accidents on roads with less than 22% mean absolute error (relative to the actual count) and whether an accident will occur or not with over 87% AUROC, averaged over states. We achieve these results by using multitask learning to account for cross-state variabilities (e.g., availability of accident labels) and transfer learning to combine traffic volume with accident prediction. Ablation studies highlight the importance of road graph-structural features, amongst other features. Lastly, we discuss the implications of the analysis and develop a package for easily using our new dataset.