This document describes a methodology for designing fault-protection (FP) software for autonomous spacecraft. The methodology embodies and extends established engineering practices in the technical discipline of Fault Detection, Diagnosis, Mitigation, and Recovery, and it has been successfully implemented in the Deep Impact spacecraft, a NASA Discovery mission. Building on the established concepts of Fault Monitors and Responses, the methodology extends the notions of Opinion, Symptom, Alarm (also known as Fault), and Response with numerous new notions, sub-notions, software constructs, and logic and timing gates. For example, a Monitor generates a RawOpinion, which graduates into an Opinion categorized as no-opinion, acceptable, or unacceptable. RaiseSymptom, ForceSymptom, and ClearSymptom govern the establishment of a Symptom and its subsequent mapping to an Alarm (Fault). Local Response is distinguished from FP System Response. Both 1-to-n and n-to-1 mappings are established among Monitors, Symptoms, and Responses. Responses are categorized by device versus by function. Responses operate in tiers: the early tiers attempt to resolve the Fault in a localized, step-by-step fashion, relegating more system-level responses to the later tiers. Recovery actions are gated by epoch recovery timing, enabling strategy, urgency, a MaxRetry gate, hardware availability, hazardous versus ordinary fault, and many other priority gates. The methodology is systematic and logical, and it uses multiple linked tables, parameter files, and recovery command sequences. The credibility of the FP design is proven via a "top-down" fault-tree analysis and a "bottom-up" functional failure-mode-and-effects analysis. Through this process, the mitigation and recovery strategies for each Fault Containment Region determine the scope (width versus depth) of the FP architecture.
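The flow from Monitor to Opinion to Symptom to Alarm to tiered Response can be illustrated with a minimal Python sketch. This is not the Deep Impact flight code. The monitor name, the voltage limits, the persistence count used as a timing gate, the mapping tables, and the tier names are all hypothetical; only the three opinion categories, the n-to-1 mapping idea, the tiers, and the MaxRetry gate come from the description above.

```python
from dataclasses import dataclass, field
from enum import Enum


class Opinion(Enum):
    NO_OPINION = "no-opinion"
    ACCEPTABLE = "acceptable"
    UNACCEPTABLE = "unacceptable"


@dataclass
class Monitor:
    """Graduates a raw reading into an Opinion and tracks one Symptom."""
    name: str
    low: float
    high: float
    persistence: int = 3  # assumed timing gate: consecutive bad opinions needed
    count: int = field(default=0, init=False)
    symptom: bool = field(default=False, init=False)

    def graduate(self, raw):
        # A missing reading yields no opinion rather than a verdict.
        if raw is None:
            return Opinion.NO_OPINION
        if self.low <= raw <= self.high:
            return Opinion.ACCEPTABLE
        return Opinion.UNACCEPTABLE

    def step(self, raw):
        opinion = self.graduate(raw)
        if opinion is Opinion.UNACCEPTABLE:
            self.count += 1
            if self.count >= self.persistence:
                self.symptom = True   # analogous to RaiseSymptom
        elif opinion is Opinion.ACCEPTABLE:
            self.count = 0
            self.symptom = False      # analogous to ClearSymptom
        # NO_OPINION leaves the counter and the symptom unchanged.
        return opinion


# Hypothetical n-to-1 mapping: two symptoms share one alarm.
SYMPTOM_TO_ALARM = {
    "bus_voltage_low": "POWER_FAULT",
    "battery_temp_high": "POWER_FAULT",
}
# Hypothetical tiers, ordered from local to system-level.
ALARM_TO_TIERS = {
    "POWER_FAULT": ["reset_device", "swap_to_backup", "enter_safe_mode"],
}


def run_response(alarm, attempt, max_retry=2):
    """Try each tier up to max_retry times before escalating to the next."""
    log = []
    for tier in ALARM_TO_TIERS[alarm]:
        for _ in range(max_retry):
            log.append(tier)
            if attempt(tier):
                return tier, log
    return None, log


monitor = Monitor("bus_voltage_low", low=26.0, high=34.0)
for reading in [28.0, 24.0, 24.5, None, 23.9]:
    monitor.step(reading)

if monitor.symptom:
    alarm = SYMPTOM_TO_ALARM[monitor.name]
    # Pretend only the backup swap clears the fault.
    resolved_by, log = run_response(alarm, lambda tier: tier == "swap_to_backup")
```

In this run the third unacceptable reading raises the symptom, the local tier is retried twice without success, and the response then escalates to the second tier. A real design would add the further gates listed above (urgency, hardware availability, hazardous versus ordinary fault), which this sketch omits.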
This work was done by Kevin Barltrop, Jeffrey Levison, and Edwin Kan of Caltech for NASA’s Jet Propulsion Laboratory. For further information, access the Technical Support Package (TSP) free on-line at www.techbriefs.com/tsp under the Information Sciences category.
The software used in this innovation is available for commercial licensing. Please contact Karina Edmonds of the California Institute of Technology at (818) 393-2827. Refer to NPO-41344.
This Brief includes a Technical Support Package (TSP).

Methodology for Designing Fault-Protection Software (reference NPO-41344) is currently available for download from the TSP library.
Overview
The document discusses the methodology for designing fault-protection (FP) software, particularly in the context of the Deep Impact (DI) project, developed by NASA's Jet Propulsion Laboratory (JPL). It emphasizes the importance of FP in ensuring the reliability and safety of spacecraft during missions. The FP design aims to cover all interface faults identified in Fault Containment Regions, as determined by Failure Mode and Effects Analysis (FMEA). The document highlights the balance between comprehensive FP coverage and project resource constraints, which often lead to a compromise in design complexity and cost.
The FP system for DI comprises 49 monitors, 921 symptoms, 667 alarms, and 39 responses. These figures indicate the breadth of coverage provided by the FP design, which the developers nonetheless consider minimal yet adequate. The design is influenced by previous JPL spacecraft missions, particularly the centralized FP engine approach used in the Deep Space 1 (DS-1) mission, which was both technically sound and cost-effective.
The FP software architecture is characterized by the use of formal state chart notation for defining monitors and responses, allowing for automatic code generation. This automation facilitates the integration of FP with other flight software applications, enhancing efficiency and reducing the potential for human error.
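To illustrate why a formal state chart lends itself to automatic code generation, the following Python sketch treats a response's state chart as a plain transition table and builds the stepping function from that data. The state names, event names, and the `make_machine` helper are hypothetical and are not taken from the DI flight software or from any particular code generator.

```python
# Hypothetical state chart for one response, written as data:
# (current state, event) -> next state.
TRANSITIONS = {
    ("IDLE", "symptom_raised"): "TIER1_LOCAL",
    ("TIER1_LOCAL", "fault_cleared"): "IDLE",
    ("TIER1_LOCAL", "retries_exhausted"): "TIER2_SYSTEM",
    ("TIER2_SYSTEM", "fault_cleared"): "IDLE",
}


def make_machine(transitions, initial):
    """Build a step function from a transition table."""
    state = initial

    def step(event):
        nonlocal state
        # Events with no entry for the current state are ignored.
        state = transitions.get((state, event), state)
        return state

    return step


step = make_machine(TRANSITIONS, "IDLE")
trace = [step(e) for e in
         ["symptom_raised", "retries_exhausted", "fault_cleared"]]
```

Because the behavior lives entirely in the table, a tool can emit or check that table mechanically from the chart, which is one plausible reading of how such notation reduces hand-coding errors.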
The document also discusses the challenges associated with FP design, including the added complexity, which can make testing, verification, and operations more difficult. Despite these challenges, the FP design for DI is described as logical and comprehensible, benefiting from modern tools and automated testing methods. The development process involved extensive collaboration among subsystem engineers, ensuring that the FP system effectively serves as a critical component of the spacecraft's overall engineering.
In conclusion, the FP development in the DI project exemplifies a methodical approach to spacecraft design, integrating lessons learned from past missions while addressing the unique requirements of the current project. The document serves as a valuable resource for understanding the principles and practices of fault protection in aerospace applications, highlighting the ongoing evolution of FP technology in the field of system engineering.

