A Unified Concept Model for Advancing Multilingual Summarization and Semantic Reasoning in the Automotive Space

2026-26-0676

01/16/2026

Abstract
The automotive industry produces a vast amount of multilingual text, ranging from technical manuals to diagnostic reports, that demands efficient summarization and reliable semantic reasoning. Traditional large language models (LLMs) operate at the token level; they struggle with cross-lingual understanding and domain-specific reasoning and are prone to hallucinations, leading to inaccurate insights and responses [2, 5]. This paper introduces a Unified Concept Model (UCM) architecture for the automotive domain that processes language at the concept level using multilingual, modality-agnostic embeddings, enabling coherent cross-lingual summarization and reasoning. The UCM encodes entire sentences as semantic vectors in the SONAR embedding space, a multilingual, modality-agnostic sentence representation that supports over 200 languages. Encoding at the sentence level facilitates a deeper understanding across language boundaries and of complex technical and legal content. An LCM-inspired concept transformer then performs reasoning over these embeddings, and a GPT-style decoder reconstructs fluent summaries or explanations in the desired language. Evaluated on diverse automotive datasets in over 20 languages, UCM outperformed token-level baselines, achieving ROUGE-L scores of 88% (+16% over LCM) and reducing hallucination rates to 4%. These results demonstrate UCM’s potential for scalable, accurate, and domain-specific AI systems in the automotive sector while enabling cross-lingual semantic reasoning beyond the capabilities of conventional LLMs. Furthermore, the paper briefly contextualizes UCM within the broader landscape of emerging AI models beyond LLMs, such as Large Knowledge Models and Large Reasoning Models, and discusses the open problems and future directions for advancing concept-driven AI systems.
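The abstract reports its headline result as a ROUGE-L score. As a reading aid, the following is a minimal sketch of how a sentence-level ROUGE-L F-measure is commonly computed from the longest common subsequence (LCS) of reference and candidate tokens. It is not the paper's evaluation code: the function names `lcs_length` and `rouge_l`, the lowercased whitespace tokenization, and the default `beta=1.0` are all assumptions made for illustration, and the example sentences are invented.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(reference, candidate, beta=1.0):
    """Sentence-level ROUGE-L F-measure in [0, 1].

    Assumption: tokens are lowercased, whitespace-separated words. Languages
    written without spaces would need a different tokenizer.
    """
    ref = reference.lower().split()
    cand = candidate.lower().split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    recall = lcs / len(ref)       # share of the reference that is covered
    precision = lcs / len(cand)   # share of the candidate that is matched
    return (1 + beta**2) * precision * recall / (recall + beta**2 * precision)


# Hypothetical example sentences, not taken from the paper's datasets.
reference = "replace the brake pads and check the fluid level"
candidate = "replace brake pads and check fluid"
score = rouge_l(reference, candidate)
```

Under this definition, a reported score of 88% corresponds to an F-measure of 0.88 averaged over the evaluation set. Which tokenizer, stemming, and `beta` the authors used is not stated in the abstract.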
Pages
7
Citation
Singh, Samagra et al., "A Unified Concept Model for Advancing Multilingual Summarization and Semantic Reasoning in the Automotive Space," SAE Technical Paper 2026-26-0676, 2026.
Additional Details
Published
Jan 16
Product Code
2026-26-0676
Content Type
Technical Paper
Language
English