An Explainable Machine Learning Approach to Traffic Accident Fatality Prediction

From arXiv. 10 pages, 6 figures, 2 tables. 28th International Conference on Knowledge-Based and Intelligent Information & Engineering Systems (KES 2024).

Road traffic accidents (RTAs) pose a significant public health threat worldwide, causing considerable loss of life and economic burden. The problem is particularly acute in developing countries such as Bangladesh. Reliable models for forecasting crash outcomes are crucial for implementing effective preventive measures. To support the development of targeted safety interventions, this study presents a machine learning approach for classifying road accident outcomes as fatal or non-fatal, using data from the Dhaka metropolitan traffic crash database covering 2017 to 2022. Our framework evaluates a range of classification algorithms: Logistic Regression, Support Vector Machines, Naive Bayes, Random Forest, Decision Tree, Gradient Boosting, LightGBM, and an Artificial Neural Network. We prioritize model interpretability by employing the SHAP (SHapley Additive exPlanations) method, which identifies the key factors influencing accident fatality. Our results show that LightGBM outperforms the other models, achieving a ROC-AUC score of 0.72. We conduct global, local, and feature dependency analyses to gain deeper insight into the model's behavior. The SHAP analysis reveals that casualty class, time of accident, location, vehicle type, and road type play pivotal roles in determining fatality risk. These findings offer valuable insights for policymakers and road safety practitioners in developing countries, enabling evidence-based strategies to reduce traffic crash fatalities.
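To make the attribution idea behind SHAP concrete, the following is a minimal sketch of exact Shapley values for a single prediction. Everything in it is an assumption made for illustration: the toy logistic model, its weights, the four binary feature names, and the all-zeros baseline are hypothetical and are not the paper's model or data. The SHAP library typically averages over a background dataset and uses faster tree-specific algorithms, whereas this sketch enumerates every feature coalition against one reference input.

```python
from itertools import combinations
from math import exp, factorial

# Hypothetical toy model: names and weights are illustrative only,
# not taken from the paper or the Dhaka crash database.
FEATURES = ["casualty_is_pedestrian", "night_time", "heavy_vehicle", "highway"]
WEIGHTS = [1.2, 0.6, 0.9, 0.4]
BIAS = -2.0


def predict_fatal_probability(x):
    """Logistic model returning the probability of a fatal outcome."""
    z = BIAS + sum(w * v for w, v in zip(WEIGHTS, x))
    return 1.0 / (1.0 + exp(-z))


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).

    Features outside a coalition are set to their baseline value.
    The cost grows as 2^n, so this is only practical for a few features.
    """
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for a coalition of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# One hypothetical crash: pedestrian casualty, at night, on a highway.
crash = [1, 1, 0, 1]
baseline = [0, 0, 0, 0]  # assumed reference input

phi = shapley_values(predict_fatal_probability, crash, baseline)
for name, contribution in zip(FEATURES, phi):
    print(f"{name}: {contribution:+.4f}")
```

The contributions sum to the difference between the prediction for this crash and the prediction for the baseline, which is the additivity property that SHAP's local explanations rely on. A feature that equals its baseline value (here `heavy_vehicle`) receives zero attribution.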
