Magazine Article

Algorithm-Based Fault Tolerance for Numerical Subroutines

TBMG-2421

11/1/2007

Abstract

A software library implements a new methodology for detecting faults in numerical subroutines, enabling application programs that contain those subroutines to recover transparently from single-event upsets. The library is fault-detecting middleware wrapped around the numerical subroutines. Conventional serial versions (based on LAPACK and FFTW) and a parallel version (based on ScaLAPACK) exist. The source code of the application program that contains the numerical subroutines is not modified, and the middleware is transparent to the user.
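To make the idea of fault-detecting middleware concrete, here is a minimal sketch in Python. It is not the library's code, and the abstract does not say which checks the library applies. The sketch assumes the classic column-checksum test for matrix multiplication (the column sums of A·B must equal the column sums of A multiplied by B) and retries the routine when the test fails. The names `matmul`, `with_checksum`, and `FaultDetected` are hypothetical, and `matmul` merely stands in for a real numerical subroutine.

```python
from typing import Callable, List

Matrix = List[List[float]]


class FaultDetected(RuntimeError):
    """Raised when the checksum test still fails after all retries."""


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product; a stand-in for a library routine."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def column_sums(m: Matrix) -> List[float]:
    """Sum each column of m."""
    return [sum(col) for col in zip(*m)]


def with_checksum(routine: Callable[[Matrix, Matrix], Matrix],
                  tol: float = 1e-9,
                  retries: int = 1) -> Callable[[Matrix, Matrix], Matrix]:
    """Wrap a multiply routine with a column-checksum test (assumed scheme).

    The caller keeps using the same call signature, so the calling code
    does not change; only the routine it is handed is wrapped.
    """
    def wrapped(a: Matrix, b: Matrix) -> Matrix:
        # Expected column sums of A*B, computed independently as (1^T A) B.
        expected = matmul([column_sums(a)], b)[0]
        scale = max(1.0, max(abs(x) for x in expected))
        for _ in range(retries + 1):
            c = routine(a, b)
            actual = column_sums(c)
            if all(abs(e - s) <= tol * scale
                   for e, s in zip(expected, actual)):
                return c  # checksum agrees: accept the result
            # Checksum mismatch: treat as a transient fault and recompute.
        raise FaultDetected("checksum mismatch persisted after retries")
    return wrapped
```

A caller would use `safe_matmul = with_checksum(matmul)` and then call `safe_matmul(a, b)` exactly as it would call `matmul(a, b)`. The tolerance `tol` is an illustrative choice; a real check would need a bound that accounts for floating-point round-off in the routine being protected.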

Citation
"Algorithm-Based Fault Tolerance for Numerical Subroutines," Mobility Engineering, November 1, 2007.
Additional Details
Published
11/1/2007
Product Code
TBMG-2421
Content Type
Magazine Article
Language
English