Efficient Reliability and Safety Analysis for Mixed-Criticality Embedded Systems
ISSN: 0148-7191, e-ISSN: 2688-3627
Published April 12, 2011 by SAE International in United States
Due to the increasing integration of safety-critical functionality into electronic devices, safety-related system design and certification have become a major challenge. Among other things, components must react suitably to internal errors in order to prevent a function from failing and to guarantee a certain degree of reliability. In this context, a wide variety of fault tolerance mechanisms have been developed, together with analytical treatments of their error coverage and the resulting reliability. However, most of these mechanisms induce a timing overhead, which in turn may impair the real-time capabilities of the system. More concretely, even if each error is handled adequately so that no logical failure occurs, a timing failure caused by a missed deadline cannot be ruled out. Thus, there is a growing need for methods to calculate the probability of timing failures and to prove that reliability and safety constraints are not violated.
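To make the link between recovery overhead and timing failures concrete, the following is a minimal sketch and not the analysis presented in the paper. It assumes a single task with a fixed worst-case execution time, a fixed recovery overhead per error, and errors arriving as a Poisson process with a constant rate; all function and parameter names are hypothetical. Under these assumptions, the deadline can only be missed if more errors occur in the window than the slack can absorb.

```python
import math


def max_tolerable_errors(wcet, deadline, recovery_overhead):
    """Largest number of error recoveries that still fits before the deadline.

    All three arguments use the same time unit (assumption: fixed overhead per error).
    """
    slack = deadline - wcet
    if slack < 0:
        raise ValueError("task misses its deadline even without errors")
    return int(slack // recovery_overhead)


def poisson_tail(mean, k, max_terms=200):
    """P(N > k) for N ~ Poisson(mean).

    The tail is summed directly instead of computing 1 - CDF, which would
    lose precision for the very small probabilities typical of rare faults.
    """
    i = k + 1
    term = math.exp(-mean) * mean ** i / math.factorial(i)
    total = 0.0
    for _ in range(max_terms):
        total += term
        i += 1
        term *= mean / i
        if term < total * 1e-17:
            break
    return total


def timing_failure_bound(wcet, deadline, recovery_overhead, error_rate):
    """Upper bound on the probability of a deadline miss for one activation.

    error_rate is the assumed Poisson error rate per time unit.
    """
    k = max_tolerable_errors(wcet, deadline, recovery_overhead)
    return poisson_tail(error_rate * deadline, k)
```

For example, `timing_failure_bound(2, 10, 3, 1e-6)` bounds the miss probability for a task that can absorb two recoveries within its deadline. Bursty error models would need a different tail computation than the Poisson assumption used here.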
In this paper we present an analysis approach for networked systems as well as highly integrated multi-core architectures to calculate reliability with respect to timing failures. Simulation techniques are poorly suited to this purpose and expensive: because fault events are rare, simulation times become excessive before results are statistically significant. Therefore, formal methods have been developed to prove that the considered embedded real-time system works correctly and that failure rates are bounded according to the required safety level. Furthermore, we present an extension of the basic analysis ideas that includes the influence of different error models in the reliability analysis. Special emphasis is put on mixed-criticality systems, i.e. systems with applications of different safety requirements. We propose an approach to decouple the reliability analyses for these applications and to determine an individual safety integrity level for each application. This approach makes it possible to refine the conservative concept of IEC 61508, which takes the most critical application as the basis for the whole system, enabling cost reduction and automated qualification. Using a prototype implementation for Symtavision's SymTA/S tool suite, we show how the presented methodologies can be integrated into a safety-related design flow. With this kind of tooling support, the presented approaches can be applied at different stages of the design process, such as design space exploration and optimization as well as verification and certification.
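The idea of assigning an individual safety integrity level to each application can be illustrated with a short sketch. It assumes the IEC 61508 target failure measures for high demand or continuous mode, expressed as dangerous failures per hour, and it assumes that a decoupled analysis has already produced a failure rate bound per application. The application names and function names are hypothetical, and this is not the procedure implemented in SymTA/S.

```python
# Upper limits on dangerous failures per hour for each SIL
# (IEC 61508, high demand / continuous mode), most stringent level first.
SIL_PFH_LIMITS = [(4, 1e-8), (3, 1e-7), (2, 1e-6), (1, 1e-5)]


def achieved_sil(failures_per_hour):
    """Highest SIL whose failure rate limit is met, or 0 if none is met."""
    for sil, limit in SIL_PFH_LIMITS:
        if failures_per_hour < limit:
            return sil
    return 0


def check_applications(apps):
    """Check each application against its own required SIL.

    apps maps a name to (required_sil, analysed failures per hour).
    Returns a dict mapping each name to True if its requirement is met.
    """
    return {
        name: achieved_sil(rate) >= required
        for name, (required, rate) in apps.items()
    }


if __name__ == "__main__":
    # Hypothetical mixed-criticality system with two applications.
    apps = {"brake_control": (3, 2e-8), "infotainment": (1, 3e-6)}
    print(check_applications(apps))
```

Checking each application against its own requirement avoids holding the less critical application to the failure rate limit of the most critical one, which is the conservative whole-system view.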
Citation: Sebastian, M., Axer, P., Ernst, R., Feiertag, N. et al., "Efficient Reliability and Safety Analysis for Mixed-Criticality Embedded Systems," SAE Technical Paper 2011-01-0445, 2011, https://doi.org/10.4271/2011-01-0445.