JIFI (Jet Propulsion Laboratory's Implementation of a Fault Injector) is a computer program for studying the ability of a computer to tolerate, detect, and/or recover from faults (that is, bit errors). JIFI can inject faults into user-specified central-processing-unit (CPU) registers and memory regions, with uniform random distributions in location and time. This capability makes it possible to study the fault sensitivity of either a computer regarded as a complete system or a specified component of hardware or application software. JIFI operates at the application level and is easy to use. In contrast, prior fault-injection software operates at a lower level and is more difficult to use. JIFI includes fault-injection, profiling, output-verifying, and classifying subprograms that together form an easy-to-use software interface for performing fault-injection experiments and analyzing the resulting data. JIFI generates a fault-injection-result output file for each run. Data from massive fault-injection campaigns can be collected and processed automatically.
This program was written by Anil Agrawal, Garen Khanoyan, John Beahan, Leslie Callum, Raphael Some, and Won Kim of Caltech for NASA's Jet Propulsion Laboratory. For further information, access the Technical Support Package (TSP) free online at www.nasatech.com/tsp under the Software category.
This software is available for commercial licensing. Please contact Don Hart of the California Institute of Technology at (818) 393-3425. Refer to NPO-30162.
This Brief includes a Technical Support Package (TSP).

Program Injects Random Faults for Testing Computers (reference NPO-30162) is currently available for download from the TSP library.
Overview
The document outlines the development and application of JPL's Software-Implemented Fault Injection (SWIFI) tool set, JIFI (Jet Propulsion Laboratory's Implementation of a Fault Injector), which is aimed at improving fault tolerance in parallel-processing supercomputers for future space exploration missions. The primary goal of JIFI is to emulate the effects of radiation-induced transients on hardware components, which is needed to validate the design and reliability of systems used in critical applications.
JIFI enables the injection of faults into user-specified CPU registers and memory regions, allowing for a uniform random distribution of faults in both location and time. This capability is essential for conducting fault-sensitivity studies on specific hardware and software components, as well as the overall system. The tool set includes various components such as fault injectors, profilers, output verifiers, and classifiers, providing an easy-to-use interface for executing extensive fault injection campaigns and analyzing the resulting data.
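The idea of drawing faults uniformly in location and time, then verifying and classifying the outcome, can be illustrated with a small simulation. This is only a sketch under stated assumptions, not JIFI's actual code or API: the names `Fault`, `plan_fault`, `inject`, `run_workload`, and `classify` are hypothetical, the "memory region" is a Python `bytearray` rather than real registers or process memory, and the two outcome labels are illustrative rather than JIFI's own classes.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Fault:
    step: int    # when to inject (index of the workload step)
    offset: int  # which byte of the target region to corrupt
    bit: int     # which bit of that byte to flip (0-7)


def plan_fault(rng, region_size, total_steps):
    """Draw one fault uniformly at random in location and in time."""
    return Fault(
        step=rng.randrange(total_steps),
        offset=rng.randrange(region_size),
        bit=rng.randrange(8),
    )


def inject(memory, fault):
    """Flip the selected bit of the target region in place."""
    memory[fault.offset] ^= 1 << fault.bit


def run_workload(memory, total_steps, fault=None):
    """Toy workload: repeatedly accumulate a sum over the first half of memory.

    The second half is never read, so faults landing there are masked.
    """
    live = len(memory) // 2
    acc = 0
    for step in range(total_steps):
        if fault is not None and step == fault.step:
            inject(memory, fault)
        acc = (acc + sum(memory[:live])) & 0xFFFFFFFF
    return acc


def classify(golden, observed):
    """Compare a faulty run's output against the fault-free (golden) output."""
    return "masked" if observed == golden else "output corrupted"


if __name__ == "__main__":
    rng = random.Random(2024)  # fixed seed so the campaign is repeatable
    image = bytearray(range(64))
    steps = 100
    golden = run_workload(bytearray(image), steps)
    tally = {"masked": 0, "output corrupted": 0}
    for _ in range(1000):
        fault = plan_fault(rng, len(image), steps)
        observed = run_workload(bytearray(image), steps, fault)
        tally[classify(golden, observed)] += 1
    print(tally)
```

Each run starts from a fresh copy of the memory image, so every fault is evaluated in isolation against the same golden output. A real campaign would presumably distinguish more outcomes (for example crashes, hangs, and detected errors), which this toy workload cannot produce.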
The document emphasizes the importance of user feedback in prioritizing potential extensions to the SWIFI tool set. Suggested enhancements include location-based injection targeting individual machine instructions, instruction-by-instruction tracing after faults are injected, Fortran bindings for the SWIFI API, higher-resolution fault rate control, and the ability to extract marked memory regions into result reports.
Additionally, the document discusses the logging and printing facilities available in the SWIFI tool set, which allow users to output text to the screen and log files with timestamped entries. Users can customize the verbosity of the output and suppress certain markers if desired.
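A logging facility of this kind (screen and log-file output, timestamped entries, adjustable verbosity, optional markers) might look roughly like the following sketch built on Python's standard `logging` module. It is not the SWIFI tool set's interface: `make_logger`, its parameters, the logger name, and the choice of a bracketed severity tag as the "marker" are all assumptions made for illustration.

```python
import logging
import os
import sys
import tempfile


def make_logger(log_path, verbosity=logging.INFO, show_markers=True):
    """Build a logger that writes timestamped entries to the screen and a file.

    verbosity    -- minimum severity that is emitted (e.g. logging.DEBUG)
    show_markers -- if False, the "[LEVEL]" marker is left out of each entry
    """
    logger = logging.getLogger("swifi_sketch")  # hypothetical logger name
    logger.setLevel(verbosity)
    logger.propagate = False

    # Drop handlers left over from an earlier call so entries are not duplicated.
    for handler in list(logger.handlers):
        logger.removeHandler(handler)
        handler.close()

    marker = "[%(levelname)s] " if show_markers else ""
    formatter = logging.Formatter("%(asctime)s " + marker + "%(message)s")

    screen = logging.StreamHandler(sys.stdout)
    screen.setFormatter(formatter)
    logger.addHandler(screen)

    logfile = logging.FileHandler(log_path, mode="w")
    logfile.setFormatter(formatter)
    logger.addHandler(logfile)
    return logger


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "campaign.log")
    log = make_logger(path, verbosity=logging.DEBUG)
    log.info("campaign started")
    log.debug("bit flipped in target region")  # shown only at DEBUG verbosity
    print("log written to", path)
```

Sharing one formatter between the two handlers keeps screen and file output identical, which makes it easier to match what was seen during a run with what was recorded.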
Overall, the JIFI tool set gives researchers and engineers the tools to evaluate and improve the resilience of computing systems against faults. The work was carried out under the auspices of NASA and the Jet Propulsion Laboratory. The document serves as a technical support package, detailing the methods and tools being developed to ensure the reliability of future space missions.

