Visual–Language Model–Driven Annotation and Analysis of Maneuver Data for Enhanced Vehicle Safety

Abstract
Vehicle maneuver data are essential for perception and planning in advanced driver-assistance systems (ADAS) and automated driving systems (ADS). While high-quality annotations improve machine-learning performance, existing maneuver datasets remain fragmented, labor-intensive to annotate, and inconsistent in semantic richness, and challenges persist in scalability, interpretability, and contextual labeling. This article establishes a structured framework for maneuver data analysis by combining a systematic review of existing resources with the development of a new multimodal dataset. First, we systematically review publicly available datasets such as HDD, KITTI, BDD-X, D2CAV, Brain4Cars, DrivingDojo, and the Driving Behavior Database. We further evaluate data modalities and sensor configurations, including event data recorders, onboard logging systems, and smartphone sensing. We then propose the Matt3r Data Collection System with modern metadata management, which integrates video, GPS, and IMU signals into temporally coherent clips. Next, we outline the limitations of traditional annotation approaches, which rely on manual labeling and rule-based methods. To address these limitations, we propose a Vision–Language Model (VLM)–driven annotation pipeline in which VLMs generate maneuver categories and causal explanations through prompt-based reasoning, with selected outputs refined through human-in-the-loop verification. Finally, we propose an annotation quality evaluation based on accuracy, inter-annotator agreement, credibility, consistency, and efficiency gain. In summary, this article bridges the gap between the environment perception requirements of existing ADAS and ADS and the developing capabilities of generative artificial intelligence. By providing a novel and scalable research approach for AI-driven maneuver data annotation and analysis, it supports data engineering efforts for both research and practical applications aimed at enhancing vehicle safety.
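As an illustration of two of the quality criteria named above (accuracy and inter-annotator agreement), the following minimal Python sketch compares VLM-generated maneuver labels against human labels. The abstract does not state how the article defines these metrics, so the use of Cohen's kappa as the agreement measure is an assumption. The maneuver label names and the function names `accuracy` and `cohen_kappa` are hypothetical.

```python
from collections import Counter


def accuracy(predicted, reference):
    """Fraction of clips where the predicted label equals the reference label."""
    if len(predicted) != len(reference) or not reference:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(p == r for p, r in zip(predicted, reference)) / len(reference)


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: agreement between two annotators, corrected for chance.

    Assumption: kappa is used here as one common agreement measure; the
    article's own definition is not given in the abstract.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: share of items labeled identically.
    p_observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if both annotators labeled independently
    # according to their own label frequencies.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    categories = counts_a.keys() | counts_b.keys()
    p_expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if p_expected == 1.0:
        # Both annotators used a single identical label for every item.
        return 1.0
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical maneuver labels for six clips (not taken from the article).
vlm_labels = ["left_turn", "right_turn", "lane_change",
              "left_turn", "stop", "lane_change"]
human_labels = ["left_turn", "right_turn", "lane_change",
                "lane_change", "stop", "lane_change"]

print(round(accuracy(vlm_labels, human_labels), 3))     # 0.833
print(round(cohen_kappa(vlm_labels, human_labels), 3))  # 0.769
```

Kappa is lower than raw accuracy here because part of the observed agreement would be expected by chance given how often each label occurs. This is why an agreement measure is often reported alongside accuracy.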
Citation
Bai, L., Yuan, C., Osman, I., Lin, Z., et al., "Visual–Language Model–Driven Annotation and Analysis of Maneuver Data for Enhanced Vehicle Safety," SAE Int. J. Trans. Safety 14(1), 2026, https://doi.org/10.4271/09-14-01-0032.
Additional Details
Publisher
Published
1 hour ago
Product Code
09-14-01-0032
Content Type
Journal Article
Language
English