Introducing machine learning (ML) into safety-critical systems
presents a fundamental challenge, as traditional safety analysis
techniques often struggle to capture the dynamic, data-driven, and
non-deterministic behavior of learning-enabled components. To
address this gap, the Machine Learning Failure Mode and Effects
Analysis (ML FMEA) methodology was developed as an open-source
framework tailored to ML-specific risks. This paper reports on the
maturation of ML FMEA from an initial concept into a proven,
practice-driven methodology.
We make four primary contributions. First, we extend the ML FMEA pipeline with two new
stages: a “Step Zero” for problem definition and system-level hazard analysis, and a “Step 5” for constructing ground truth or reward signals. Autonomous vehicle and humanoid robot case studies illustrate the practical use and safety benefits of
these additions. Second, we introduce tailored Severity, Occurrence, and Detection criteria for ML risk assessment (illustrated below), resolving ambiguities encountered when applying traditional FMEA metrics to ML development processes. Third, we demonstrate systematic alignment
between ML FMEA artifacts and requirements from ISO/PAS 8800, ISO 21448 (SOTIF), ISO/TS 5083, ISO/IEC TR 5469, and UL 4600, providing a bridge between ML development practices and safety certification expectations. Fourth, we present cross-industry perspectives spanning automotive, aerospace, industrial robotics, and
defense, highlighting deployment pathways and best practices for
domain-specific adaptation. Through open-source collaboration and
cross-industry validation, the ML FMEA has matured into a practical
toolset that enables safety-informed ML workflows, supporting
auditable, repeatable, and risk-aware development of learning-enabled
systems.
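As brief context for the second contribution: classical FMEA ranks each failure mode by a Risk Priority Number computed from its Severity, Occurrence, and Detection ratings, conventionally scored on 1–10 ordinal scales,

\[
\mathrm{RPN} = S \times O \times D .
\]

The ratings that follow are purely illustrative and are not the paper's tailored criteria: a hypothetical distribution-shift failure mode rated S = 8 (hazardous effect), O = 5 (occasional within the expected operational domain), and D = 6 (difficult to detect before deployment) would score RPN = 8 × 5 × 6 = 240, flagging it for prioritized mitigation.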