Software for Fault-Tolerant Matrix Multiplication
TBMG-1749
02/01/2004
- Content
Formal Linear Algebra Recovery Environment is a computer program for high-performance, fault-tolerant matrix multiplication. The program extends the prior theory and practice of fault-tolerant matrix-matrix multiplication of the form C = AB. The extension provides low-overhead methods for detecting errors not only in C, but also in A and/or B. These methods detect all errors provided that, in a given case, only one entry in A, B, or C is corrupted. Once an error is detected, the program corrects it by following a low-overhead roll-back approach. Computational experiments have demonstrated that the methods implemented in this program work well in practice and impose an acceptably low overhead relative to high-performance matrix-multiplication methods that do not afford fault tolerance.
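To make the idea of low-overhead error detection concrete, the following is a minimal sketch of the classic checksum test for C = AB. It is not taken from the program described here, and it covers only a corrupted entry in C, not the extension to errors in A or B. It compares Cw with A(Bw) and vC with (vA)B, using all-ones checksum vectors w and v. Each check costs only matrix-vector products, which is why the overhead is small compared with the multiplication itself. All function names, the tolerance value, and the choice to repair by recomputing a single entry are assumptions made for illustration.

```python
# Illustrative sketch only: classic checksum-based detection of a single
# corrupted entry in C = A B. Every name below is hypothetical.

def matmul(A, B):
    """Plain product of two list-of-lists matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def matvec(M, x):
    """Matrix times column vector."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def vecmat(x, M):
    """Row vector times matrix."""
    return [sum(xi * m for xi, m in zip(x, col)) for col in zip(*M)]

def find_suspect_entries(A, B, C, tol=1e-9):
    """Return (bad_rows, bad_cols): indices where the checksums disagree.

    Row check:    C w     versus  A (B w)    with w = all ones.
    Column check: v^T C   versus  (v^T A) B  with v = all ones.
    Both lists are empty when C passes the checks. The tolerance is an
    assumed value; real data would need one scaled to the problem.
    """
    w = [1.0] * len(B[0])
    v = [1.0] * len(A)
    row_res = [l - r for l, r in zip(matvec(C, w), matvec(A, matvec(B, w)))]
    col_res = [l - r for l, r in zip(vecmat(v, C), vecmat(vecmat(v, A), B))]
    bad_rows = [i for i, r in enumerate(row_res) if abs(r) > tol]
    bad_cols = [j for j, c in enumerate(col_res) if abs(c) > tol]
    return bad_rows, bad_cols

def recompute_entry(A, B, i, j):
    """Recompute one entry of C from A and B (a stand-in for roll-back)."""
    return sum(A[i][k] * B[k][j] for k in range(len(B)))

if __name__ == "__main__":
    A = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    B = [[1.0, 0.0, 2.0], [0.0, 1.0, 3.0]]
    C = matmul(A, B)
    C[1][2] += 5.0                      # inject a single fault into C
    rows, cols = find_suspect_entries(A, B, C)
    print(rows, cols)                   # the fault sits at their intersection
    C[rows[0]][cols[0]] = recompute_entry(A, B, rows[0], cols[0])
    print(find_suspect_entries(A, B, C))
```

Under the single-error assumption, one failing row check and one failing column check intersect at the corrupted entry of C, so only that entry needs to be recomputed. A fault that corrupts A or B before the multiplication would pass these two checks, because C would then be consistent with the corrupted inputs. Catching that case requires checksums of A and B saved beforehand, which this sketch does not attempt.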
- Citation
- "Software for Fault-Tolerant Matrix Multiplication," Mobility Engineering, February 1, 2004.