Boosting Algorithms for the Accident Severity Classification

Islam Babaev; Igor Mozolin; Divya Garikapati

doi:10.4271/12-08-04-0030

Authors

Islam Babaev

Woven by Toyota, USA

Igor Mozolin

Woven by Toyota, Japan

Divya Garikapati

Woven by Toyota, USA

Abstract

Background: Road accident severity estimation is a critical aspect of road safety analysis and traffic management. Accurate severity estimation contributes to the formulation of effective road safety policies. Knowledge of the potential consequences of certain behaviors or conditions can contribute to safer driving practices. Identifying patterns of high-severity accidents allows for targeted improvements in overall road safety.

Objective: This study analyzes road accidents using real data from "CRSS," an open database of US road accidents. It employs the boosting algorithms LGBM, XGBoost, and CatBoost to classify accident severity based on various parameters. The study also aims to contribute to road safety by providing predictive insights for stakeholders, the functional safety engineering community, and policymakers using the KABCO classification system. The article includes sections covering theoretical methodology, data analysis, model development, evaluation, performance metrics, and implications for improving road safety measures, and it compares the performance of the boosting algorithms on the CRSS dataset. This study aims to identify the most effective machine learning algorithm to integrate into our product line in the near future, enabling accurate prediction of both accident severity and occurrence.

Results and Conclusions: This study addresses challenges in evaluating performance metrics for different severity classes within unbalanced datasets, emphasizing the impact of dominant classes such as Class O (O = no apparent injury) on overall accuracy. The investigation reveals the limitations and conservatism associated with imbalanced data in boosting models, hinting at a potential ceiling in their performance around 80%. Comparative analysis of CatBoost, XGBoost, and LGBM shows comparable performance across various metrics, especially accuracy, F₁-score, ROC-AUC, and PR-AUC, for all severity classes, even when the KNN algorithm is applied for pre-processing. XGBoost with KNN pre-processing did not show any significant performance improvement over XGBoost without it. The study applies performance metrics such as F₁-score, confusion matrix (CM) upper triangle, ROC-AUC, and PR-AUC to an accident analysis case study. Future work directions involve extending the application of CatBoost, XGBoost, and other algorithms to diverse datasets, exploring the capabilities of deep neural networks, refining dataset preparation for accuracy improvement, and creating unified tools for hazard analysis and risk assessment.
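The abstract's point about a dominant class inflating overall accuracy can be illustrated with a minimal sketch. It is not the authors' evaluation code: the KABCO label counts below are hypothetical values chosen only to mimic an imbalanced dataset (they are not CRSS figures), and the "classifier" is a trivial baseline that always predicts Class O. The sketch compares plain accuracy with the macro-averaged F₁-score, which weights every severity class equally.

```python
# Hypothetical KABCO label counts, chosen only to illustrate class imbalance.
# These are NOT taken from the CRSS dataset or from the article.
CLASSES = ["K", "A", "B", "C", "O"]
true_counts = {"K": 10, "A": 40, "B": 100, "C": 150, "O": 700}

# Ground-truth labels, plus a trivial baseline that always predicts "O".
y_true = [label for label in CLASSES for _ in range(true_counts[label])]
y_pred = ["O"] * len(y_true)


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_per_class(y_true, y_pred, labels):
    """One-vs-rest F1 for each label: 2*TP / (2*TP + FP + FN)."""
    scores = {}
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


acc = accuracy(y_true, y_pred)
per_class = f1_per_class(y_true, y_pred, CLASSES)
macro = sum(per_class.values()) / len(per_class)

print(f"accuracy = {acc:.3f}")
print(f"macro F1 = {macro:.3f}")
for label in CLASSES:
    print(f"F1[{label}] = {per_class[label]:.3f}")
```

With these made-up counts, the always-O baseline reaches 70% accuracy while scoring an F₁ of zero on every injury class, which is why per-class metrics are reported alongside accuracy for imbalanced severity data.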

Meta Tags

Topics: Accident reconstruction; Neural networks; Machine learning; Risk assessments; Crashes; Mathematical models; Traffic management; Research and development; Crash prevention

Affiliated or Co-Author: Woven by Toyota, USA; Woven by Toyota, Japan

Details

DOI: https://doi.org/10.4271/12-08-04-0030

Pages: 17

Citation: Babaev, I., Mozolin, I., and Garikapati, D., "Boosting Algorithms for the Accident Severity Classification," SAE Int. J. CAV 8(4), 2025, https://doi.org/10.4271/12-08-04-0030.

Additional Details

Publisher: SAE International

Published: Oct 17, 2024

Product Code: 12-08-04-0030

Content Type: Journal Article

Language: English

SAE International Journal of Connected and Automated Vehicles

Volume 8, Issue 4