AI LabelMate: A Context-Aware Annotation Agent for Reducing Semantic Fragmentation

2026-01-0261

4/7/2026

Abstract
Robust perception systems for autonomous vehicles rely heavily on high-quality labeled data, particularly in off-road and unstructured environments. However, perception model performance is often degraded by "data chaos": inconsistent, fragmented labels that result from limitations in automated segmentation. Foundation models such as SAM2, while powerful, typically generate masks from low-level visual cues, including color and texture gradients. In complex off-road scenes, this leads to semantic fragmentation: a single object, such as a moss-covered log, can be split not only into dozens of segments for its bark and moss but also into hundreds of smaller, meaningless patches based on minor color variations. This paper introduces a context-aware annotation agent to resolve this issue. Our workflow integrates a vision-language model (Florence-2) for scene understanding with a segmentation model (SAM2) for mask generation. Instead of segmenting indiscriminately, our agent uses Florence-2 to comprehend the image holistically and localize complete objects. For example, after Florence-2 identifies a "moss-covered log," its semantic context guides the generation of masks for the entire entity or for meaningful sub-components, such as moss patches and bark, rather than for fragmented color variations. This initial mask, generated in seconds, gives annotators a strong starting point and significantly reduces the manual effort required for vertex-by-vertex outlining. Annotators retain complete editing control: they can adjust polygon vertices to obtain a pixel-perfect mask, or draw a bounding box or sketch to segment an object automatically. The agent provides a framework that exploits the complementary strengths of scene understanding and segmentation models. Deploying each model for its own specialized task makes it possible to build more consistent, high-quality automotive datasets faster, which speeds up the creation of safer perception systems.
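The localize-then-segment idea described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation and it calls neither Florence-2 nor SAM2: `fake_localize` and `fake_segment` are hypothetical stand-ins for the two models, the image is a small grid of integer "colors", and `count_color_fragments` is an assumed baseline that mimics segmentation driven purely by low-level color cues. All names here are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Image = List[List[int]]          # toy image: one integer "color" per pixel
Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1); x1 and y1 are exclusive
Mask = List[List[bool]]


@dataclass
class Region:
    label: str  # semantic label from the scene-understanding step
    box: Box    # where the complete object sits in the image


@dataclass
class Annotation:
    label: str
    mask: Mask


def count_color_fragments(image: Image) -> int:
    """Baseline: count 4-connected regions of identical color."""
    h, w = len(image), len(image[0])
    seen = [[False] * w for _ in range(h)]
    fragments = 0
    for y in range(h):
        for x in range(w):
            if seen[y][x]:
                continue
            fragments += 1
            seen[y][x] = True
            stack = [(y, x)]
            while stack:  # iterative flood fill over same-colored neighbors
                cy, cx = stack.pop()
                for ny, nx in ((cy + 1, cx), (cy - 1, cx), (cy, cx + 1), (cy, cx - 1)):
                    if (0 <= ny < h and 0 <= nx < w and not seen[ny][nx]
                            and image[ny][nx] == image[cy][cx]):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
    return fragments


def annotate(
    image: Image,
    localize: Callable[[Image], List[Region]],
    segment: Callable[[Image, Box], Mask],
) -> List[Annotation]:
    """Produce one mask per semantically localized object."""
    return [Annotation(r.label, segment(image, r.box)) for r in localize(image)]


def fake_localize(image: Image) -> List[Region]:
    # Hypothetical stand-in for a vision-language model: returns a fixed region.
    return [Region("moss-covered log", (1, 1, 5, 3))]


def fake_segment(image: Image, box: Box) -> Mask:
    # Hypothetical stand-in for a box-prompted segmenter: fills the box.
    x0, y0, x1, y1 = box
    return [[y0 <= y < y1 and x0 <= x < x1 for x in range(len(image[0]))]
            for y in range(len(image))]


# 0 = ground, 1 = bark, 2 = moss; the log is a checkerboard of bark and moss.
toy_image = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 2, 1, 2, 0],
    [0, 2, 1, 2, 1, 0],
    [0, 0, 0, 0, 0, 0],
]

fragments = count_color_fragments(toy_image)  # ground plus 8 log patches
annotations = annotate(toy_image, fake_localize, fake_segment)
log_pixels = sum(cell for row in annotations[0].mask for cell in row)
print(fragments, len(annotations), annotations[0].label, log_pixels)
```

In this toy scene the color-only baseline splits the log into eight single-pixel patches, while the context-guided path yields one labeled mask covering the same eight pixels. Passing the localizer and segmenter as callables keeps the two roles separate, which mirrors the division of labor the abstract describes; a real system would substitute actual model calls for the two stand-ins.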
Citation
Patil, A., Mikulski, D., Mwakalonge, J., and Jia, Y., "AI LabelMate: A Context-Aware Annotation Agent for Reducing Semantic Fragmentation," WCX SAE World Congress Experience, Detroit, Michigan, United States, April 14, 2026, https://doi.org/10.4271/2026-01-0261.
Additional Details
Publisher
Published: Apr 07
Product Code: 2026-01-0261
Content Type: Technical Paper
Language: English