Diagnostic Models for Failure Analysis and Operations
- Tuesday, 01 February 2011
Diagnostic models provide significant analytical and operational benefits to improve the dependability and efficiency of NASA systems.
The Constellation Program and the Exploration Technology Development Program (ETDP) funded the development of diagnostic models using the TEAMS (Testability Engineering and Maintenance System) tool for the Ares, Orion, and Ground Operations Projects to demonstrate operational uses for ground processing and launch operations. These models were found useful not only for operational pre-launch checkout, but also for analysis of failure effects, failure detection coverage, and fault isolation effectiveness. TEAMS, a commercial model-based tool from Qualtech Systems, Inc. (East Hartford, CT), performs fault diagnostics (isolation and identification). Fault isolation means identifying the location of the fault (cause) that is compromising system functions. Fault identification means identifying the failure mode (mechanism) that is causing system failure. Diagnostics refers to both fault isolation and identification functions.
TEAMS lets engineers model a system architecture as a directed graph in which nodes represent components and directed arcs (lines with arrows) represent the connections between components. Failure modes are modeled as the lowest-level internal elements of a component, and failure effects are modeled as functions associated with each component. The locations of the components for these functions define the failure effect propagation paths. Sensors and measurements are associated with test points, and “tests” at those points define which functions are observable. Changes to the system configuration are controlled through switch states, which represent component power, mechanical, and software switches. Failure effects are modeled with effect nodes or pseudo sensors. With TEAMS, engineers can use forward- and backward-chaining logic to analyze how well the measurement suite detects failures and isolates faults, and can perform operational diagnostics to determine the locations and mechanisms of failure causes. The resulting TEAMS model is a “diagnostic model.”
Effective Failure Detection
TEAMS enables a variety of useful and important failure analyses. Its ability to trace from a failure mode to all of its effects (whether sensed or not), and from a particular effect to all possible failure mode causes, is useful for a host of applications. Tracing backwards to all possible causes of a failure effect is important for caution and warning, launch commit criteria, fault trees, and probabilistic risk analyses. Conversely, tracing forward to all possible effects is necessary for the understanding of failure scenarios and all mechanisms in which these effects are observed. Combining forward and backward traces is the basis for assessment of failure detection coverage and fault isolation effectiveness. The former is needed for analysis and verification of failure mitigation mechanisms, and the latter for assessment and development of repair strategies.
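A minimal sketch of these two analyses follows. It assumes the forward traces have already been collected into a table from failure mode to effects; the table contents, the test names, and the helper functions are hypothetical and do not reflect how TEAMS stores or computes its results. Failure detection coverage is taken here as the fraction of failure modes seen by at least one test, and fault isolation effectiveness is illustrated by grouping failure modes that produce identical test signatures and so cannot be told apart.

```python
from collections import defaultdict

# Hypothetical forward traces: failure mode -> effects it produces.
EFFECTS_OF = {
    "pump_motor_stall": {"low_flow", "low_chamber_pressure"},
    "filter_clog": {"low_flow", "low_chamber_pressure"},
    "valve_stuck_closed": {"low_chamber_pressure"},
    "nozzle_crack": {"thrust_asymmetry"},
}

# Hypothetical measurement suite: effect -> test that observes it.
# Effects missing from this table are unsensed.
TEST_FOR = {"low_flow": "flow_test", "low_chamber_pressure": "pressure_test"}


def causes_of(effect, effects_of):
    """Backward trace: every failure mode that can produce `effect`."""
    return {fm for fm, effects in effects_of.items() if effect in effects}


def signature(failure_mode, effects_of, test_for):
    """The set of tests that fail when `failure_mode` occurs."""
    return frozenset(test_for[e] for e in effects_of[failure_mode] if e in test_for)


def detection_coverage(effects_of, test_for):
    """Fraction of failure modes detected by at least one test."""
    detected = [fm for fm in effects_of if signature(fm, effects_of, test_for)]
    return len(detected) / len(effects_of)


def ambiguity_groups(effects_of, test_for):
    """Group detected failure modes that share an identical test signature."""
    groups = defaultdict(set)
    for fm in effects_of:
        sig = signature(fm, effects_of, test_for)
        if sig:  # undetected failure modes cannot be isolated at all
            groups[sig].add(fm)
    return list(groups.values())
```

In this toy table the nozzle crack produces only an unsensed effect, so it lowers detection coverage, and the pump stall and filter clog share a signature, so the tests can isolate the fault only to that pair. Gaps of this kind are what would prompt adding a measurement or planning a manual repair check.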
The process of building and verifying the diagnostic model involves face-to-face meetings in which subsystem designers, safety and failure model analysts, systems engineers, and modelers formally trace the failure effect propagation paths through the system schematics. Doing so significantly improves the quality of the Failure Modes and Effects Analyses (FMEAs) by providing accurate failure effects and detection mechanisms. If fault tree nodes are modeled, then the directed graph model formally connects the fault trees to the FMEAs and to the system architecture, providing a means to uncover gaps and overlaps. The model assists with assessment of the Time to Criticality for failures by defining the precise paths along which failure effects propagate, including the specific physics at each step. Finally, the model is delivered to operations personnel to provide systems diagnostics during ground processing and launch operations.
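One way to picture the Time to Criticality assessment is as a path calculation over the propagation graph. The sketch below assumes that each propagation step has been assigned a delay from the physics of that step, and that Time to Criticality can be approximated as the smallest cumulative delay from the failure to a critical effect. Both assumptions, along with the node names and delay values, are illustrative and are not taken from any NASA model.

```python
import heapq

# Hypothetical propagation graph: node -> {next effect: delay in seconds}.
PROPAGATION = {
    "hydraulic_leak": {"low_reservoir_level": 5.0},
    "low_reservoir_level": {"pump_cavitation": 20.0, "low_actuator_pressure": 60.0},
    "pump_cavitation": {"low_actuator_pressure": 10.0},
    "low_actuator_pressure": {"loss_of_thrust_vector_control": 2.0},
    "loss_of_thrust_vector_control": {},
}


def time_to_criticality(graph, failure, critical_effect):
    """Smallest cumulative delay from `failure` to `critical_effect`.

    Uses Dijkstra's algorithm over non-negative step delays.
    Returns None if the critical effect is unreachable.
    """
    best = {failure: 0.0}
    heap = [(0.0, failure)]
    while heap:
        elapsed, node = heapq.heappop(heap)
        if node == critical_effect:
            return elapsed
        if elapsed > best.get(node, float("inf")):
            continue  # stale queue entry; a faster path was already found
        for nxt, delay in graph.get(node, {}).items():
            candidate = elapsed + delay
            if candidate < best.get(nxt, float("inf")):
                best[nxt] = candidate
                heapq.heappush(heap, (candidate, nxt))
    return None
```

Taking the fastest path is a conservative choice, since it gives the least time available to detect the failure and respond before the critical effect occurs.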
An architectural concept for fault detection, isolation, and recovery (FDIR) was formulated under the ETDP Integrated System Health Management Project to integrate vehicle and ground fault models, as well as other health management tools and techniques. This FDIR architecture was tested during execution of the Ares I-X Ground Diagnostic Prototype (GDP). The Ares I-X GDP demonstrated anomaly detection (detecting unexpected events, generally different from what has previously been observed), failure detection, and fault diagnostics for the Ares I-X First Stage Thrust Vector Control, and for the associated ground hydraulics while the vehicle was in the Vehicle Assembly Building at Kennedy Space Center (KSC) and while it was on the launch pad.