Vision-Language Model Training for Code Generation from Requirements and Model-Based Artifacts

2026-01-0770

To be published on June 1, 2026

Authors
Padubrin, M., Kulzer, A., and Guerocak, E.
Abstract
Recent advances in Vision-Language Models (VLMs) have opened new possibilities for bridging the gap between systems engineering artifacts and automated code generation. Traditional Large Language Models (LLMs) are trained primarily on textual data and generic code repositories, which limits their ability to interpret graphical engineering artifacts such as Simulink block diagrams or system architecture models. In safety-critical domains such as the automotive industry, these graphical models are central to development workflows and must remain closely aligned with textual requirements and implementation code to ensure traceability, compliance, and functional correctness. This paper proposes a VLM-centered multimodal training framework for code generation that integrates textual requirements, graphical model-based artifacts, and annotated source code into a unified learning process. By leveraging models that combine vision encoders with language backbones, the approach enables the model to jointly learn the structural semantics of engineering diagrams and the linguistic and syntactic patterns of requirements and code. This alignment allows the model to generate code that is not only syntactically correct but also semantically consistent with both textual specifications and graphical designs. We evaluate the approach on a representative automotive dataset consisting of requirements, Simulink block diagrams, and C/C++ implementations. Preliminary results demonstrate that incorporating visual model representations significantly improves code correctness, requirement alignment, and structural consistency over text-only baselines. These findings highlight the potential of VLMs to enable more accurate, adaptive, and domain-compliant code generation, paving the way for their integration into future model-based software development workflows.
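The abstract does not specify the concrete backbone or training recipe, so the following is a minimal sketch of how such multimodal supervised fine-tuning could be set up, assuming an off-the-shelf LLaVA-style checkpoint used through the Hugging Face transformers library. The model ID, prompt template, file name, and the requirement/code pair are illustrative placeholders, not items from the paper's dataset.

```python
# Minimal sketch: supervised fine-tuning signal for a VLM on
# (diagram image, requirement text, target code) triples.
# Assumption: a LLaVA-style checkpoint; the paper's actual model,
# prompt format, and data are not given in the abstract.
import torch
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "llava-hf/llava-1.5-7b-hf"  # placeholder backbone
processor = AutoProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(model_id, torch_dtype=torch.bfloat16)

def build_example(diagram_path: str, requirement: str, code: str) -> dict:
    """Pair a block-diagram image with its textual requirement and
    supervise on the target C implementation (standard causal-LM loss)."""
    prompt = (
        f"USER: <image>\nRequirement: {requirement}\n"
        f"Generate the C implementation.\nASSISTANT: {code}"
    )
    image = Image.open(diagram_path).convert("RGB")
    inputs = processor(images=image, text=prompt, return_tensors="pt")
    labels = inputs["input_ids"].clone()
    labels[labels == model.config.image_token_index] = -100  # no loss on image tokens
    inputs["labels"] = labels
    return inputs

# Hypothetical triple; one forward pass yields the loss for an optimizer step.
batch = build_example(
    "cruise_control_diagram.png",
    "The controller shall limit commanded acceleration to 2 m/s^2.",
    "float limit_accel(float a) { return a > 2.0f ? 2.0f : a; }",
)
loss = model(**batch).loss
```

In a full pipeline one would additionally mask the prompt tokens in `labels` so the loss covers only the generated code, and wrap the forward pass in a standard optimizer loop over the dataset.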
Citation
Padubrin, M., Kulzer, A., and Guerocak, E., "Vision-Language Model Training for Code Generation from Requirements and Model-Based Artifacts," 2026 Stuttgart International Symposium, Stuttgart, Germany, July 8, 2026.
Additional Details
Publisher
SAE International
Published
To be published on June 1, 2026
Product Code
2026-01-0770
Content Type
Technical Paper
Language
English