Integrating Large Language Models for Severity Classification in Traffic Incident Management: A Machine Learning Approach

This study evaluates how large language models (LLMs) can enhance machine learning workflows for traffic incident management. It examines the extent to which features generated by modern language models improve, or at least match, prediction accuracy when classifying incident severity from accident reports. Multiple comparisons were performed across combinations of language models and machine learning algorithms, including Gradient Boosted Decision Trees, Random Forests, and Extreme Gradient Boosting. The study performs severity classification with conventional features, language model-derived features obtained from texts and incident reports, and combinations of the two. Combining language model features with those obtained directly from incident reports is shown to improve, or at least match, the performance of machine learning techniques in assigning severity levels to incidents, particularly with Random Forests and Extreme Gradient Boosting. The comparison was quantified using the F1-score on data sets sampled uniformly so that the severity classes are balanced. The primary contribution of this research is a demonstration of how LLMs can be integrated into machine learning workflows for incident management, simplifying feature extraction from unstructured text while improving or matching the accuracy of severity predictions obtained with a conventional machine learning pipeline. The engineering application is illustrated by using these language models to refine the modelling process for incident severity classification. The work offers insight into how language processing capabilities can be combined with traditional data to improve machine learning pipelines for incident severity classification.
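
The evaluation setup described above (combining report features with language model features, sampling uniformly to balance the severity classes, and scoring with F1) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `balanced_sample`, `combine_features`, and `macro_f1` are hypothetical, the macro-averaging of per-class F1 is an assumption (the abstract only says "F1-score"), and the language model features are taken to be plain numeric vectors produced elsewhere.

```python
import random
from collections import defaultdict


def balanced_sample(labels, per_class, seed=0):
    """Pick `per_class` indices uniformly at random from every severity class.

    Raises ValueError if any class has fewer than `per_class` examples.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    chosen = []
    for label in sorted(by_class):
        chosen.extend(rng.sample(by_class[label], per_class))
    return sorted(chosen)


def combine_features(report_features, text_features):
    """Concatenate, per incident, report features with language model features."""
    return [list(r) + list(t) for r, t in zip(report_features, text_features)]


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (assumed averaging scheme)."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

In such a setup, the indices returned by `balanced_sample` would select the rows of the combined feature matrix passed to a classifier such as a Random Forest, and `macro_f1` would compare its predictions with the true severity labels.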
