This study introduces a deep learning framework for forecasting weather-related traffic crash risk from heterogeneous spatiotemporal data. Because crash occurrence depends in complex, non-linear ways on factors such as road characteristics and traffic conditions, we propose an ensemble of Convolutional Long Short-Term Memory (ConvLSTM) models trained over overlapping spatial grids. This approach captures both spatial dependencies and temporal dynamics while addressing spatial heterogeneity in crash patterns. North Carolina was selected as the study area for its diverse weather conditions, and historical crash, weather, and traffic data were aggregated at a 5-mi by 5-mi grid resolution. The framework was evaluated using Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and spatial cross-K analysis. Results show that the ensemble ConvLSTM significantly outperforms the baseline models (linear regression, ARIMA, and a standard ConvLSTM), particularly in high-risk zones. By combining the strengths of multiple ConvLSTM models, the ensemble achieves lower MSE and RMSE values across all regions, most notably when data from different crash risk zones are aggregated. The model performs best in volatile high-risk areas (Cluster 1), where it achieves the lowest MSE and RMSE; in stable low-risk areas (Cluster 2), it still improves on simpler models but with slightly higher errors, owing to the difficulty of capturing subtle variations.
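As a rough illustration of how predictions from models trained on overlapping grids might be combined and scored, the sketch below averages per-cell predictions wherever tiles overlap and then computes MSE and RMSE against observed values. The abstract does not state how the ensemble merges its members, so simple averaging is an assumption here, and the function names (`blend_overlapping_tiles`, `mse`, `rmse`) and the toy numbers are hypothetical. The ConvLSTM models themselves and the cross-K analysis are not shown.

```python
import math


def blend_overlapping_tiles(tile_preds, grid_shape):
    """Average per-cell predictions from overlapping tiles.

    tile_preds: list of (row0, col0, patch), where patch is a 2-D list of
    predicted risk values from one model, placed with its top-left corner
    at grid cell (row0, col0).
    Assumption: overlaps are merged by an unweighted mean.
    """
    rows, cols = grid_shape
    total = [[0.0] * cols for _ in range(rows)]
    count = [[0] * cols for _ in range(rows)]
    for row0, col0, patch in tile_preds:
        for i, patch_row in enumerate(patch):
            for j, value in enumerate(patch_row):
                total[row0 + i][col0 + j] += value
                count[row0 + i][col0 + j] += 1
    # Cells covered by no tile are left at 0.0.
    return [
        [total[r][c] / count[r][c] if count[r][c] else 0.0 for c in range(cols)]
        for r in range(rows)
    ]


def mse(pred, truth):
    """Mean squared error over all grid cells."""
    errors = [
        (p - t) ** 2
        for pred_row, truth_row in zip(pred, truth)
        for p, t in zip(pred_row, truth_row)
    ]
    return sum(errors) / len(errors)


def rmse(pred, truth):
    """Root mean squared error over all grid cells."""
    return math.sqrt(mse(pred, truth))


# Toy example: two 2x2 tiles on a 2x3 grid, overlapping in the middle column.
tiles = [
    (0, 0, [[1.0, 2.0], [1.0, 2.0]]),
    (0, 1, [[4.0, 3.0], [4.0, 3.0]]),
]
blended = blend_overlapping_tiles(tiles, (2, 3))
observed = [[1.0, 3.0, 5.0], [1.0, 3.0, 5.0]]
print(blended, mse(blended, observed), rmse(blended, observed))
```

In this toy case the middle column is covered by both tiles, so its blended value is the mean of the two predictions. A weighted scheme (for example, favoring the model whose tile center is closest to the cell) would be an equally plausible alternative.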