A method of designing fault-tolerant networks of computers and other electronic circuits has been conceived with a view toward minimizing costs by utilizing commercial off-the-shelf (COTS) products and standards for all system and component interfaces. More specifically, the method utilizes selected features of the 1394 bus architecture and of the stack-tree topology (see Figure 1), which is a special case of the general tree topology and which complies with the 1394 standard. Of particular significance in the method is a specific type of stack tree denoted the complete stack tree (CST) of n stem nodes.
Taken by itself, the stack-tree topology is not fault tolerant: failure of any link partitions a stack tree into two segments, while failure of a stem node can partition the tree into two or three segments, depending on the specific design. Moreover, the 1394 standard does not permit loops, which would be formed if, for example, one were to connect the leaf nodes of a CST with spare links. However, the 1394 standard provides a "port disable" feature, which can be utilized to make any spare links "invisible" to the rest of the network. In the initial configuration of the network, the ports connected to the spare links are disabled, thereby disabling the spare links and preventing the formation of loops. In the event of failure of one or more nodes or a link of the initial network configuration, messages can be rerouted around the failed parts by enabling the appropriate ports to activate the appropriate spare links.
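The rerouting idea described above can be illustrated with a minimal sketch. It is not the authors' implementation and does not use any real 1394 driver interface: the `Network` class, the node names (`S1`..`S3` for stem nodes, `L1`..`L3` for leaf nodes), the three-stem example layout, and the greedy repair rule are all assumptions made for illustration. A spare link is modeled as a link whose ports start disabled, and a spare is enabled only when it joins two separate segments, so no loop is ever formed.

```python
from collections import deque


class Network:
    """Toy model: nodes joined by links whose ports may be enabled or disabled."""

    def __init__(self):
        self.nodes = set()
        self.failed = set()   # nodes that have failed
        self.enabled = {}     # frozenset({a, b}) -> True if the link's ports are enabled
        self.spares = []      # spare links, in the order they were added

    def add_link(self, a, b, spare=False):
        link = frozenset((a, b))
        self.nodes.update(link)
        self.enabled[link] = not spare   # spare links start with ports disabled
        if spare:
            self.spares.append(link)

    def active_links(self):
        """Links that are enabled and whose two end nodes are both working."""
        return [l for l, on in self.enabled.items() if on and not (l & self.failed)]

    def segments(self):
        """Groups of working nodes that can still reach each other."""
        alive = self.nodes - self.failed
        adj = {n: set() for n in alive}
        for link in self.active_links():
            a, b = link
            adj[a].add(b)
            adj[b].add(a)
        seen, groups = set(), []
        for start in sorted(alive):
            if start in seen:
                continue
            seen.add(start)
            group, queue = set(), deque([start])
            while queue:
                n = queue.popleft()
                group.add(n)
                for m in adj[n] - seen:
                    seen.add(m)
                    queue.append(m)
            groups.append(group)
        return groups

    def has_loop(self):
        """A loop-free network has exactly (nodes - segments) active links."""
        alive = self.nodes - self.failed
        return len(self.active_links()) > len(alive) - len(self.segments())

    def repair(self):
        """Enable a spare link only if it joins two different segments."""
        for link in self.spares:
            if self.enabled[link] or (link & self.failed):
                continue
            seg_of = {n: i for i, seg in enumerate(self.segments()) for n in seg}
            a, b = link
            if seg_of[a] != seg_of[b]:
                self.enabled[link] = True


def build_example():
    """Hypothetical three-stem tree: a chain of stems, one leaf per stem."""
    net = Network()
    net.add_link("S1", "S2")
    net.add_link("S2", "S3")
    for i in (1, 2, 3):
        net.add_link(f"S{i}", f"L{i}")
    net.add_link("L1", "L2", spare=True)   # spare links between leaf nodes
    net.add_link("L2", "L3", spare=True)
    return net


if __name__ == "__main__":
    net = build_example()
    print(len(net.segments()), net.has_loop())   # initial configuration
    net.failed.add("S2")                         # middle stem node fails
    print(len(net.segments()))                   # tree is now partitioned
    net.repair()                                 # enable the needed spare links
    print(len(net.segments()), net.has_loop())
```

In this toy layout, failing the middle stem node splits the tree into three segments, and enabling the two leaf-to-leaf spare links reconnects the surviving nodes without creating a loop. A real design would also have to decide which node issues the port-enable commands and how failures are detected, which this sketch leaves out.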
The upper part of Figure 2 schematically depicts a complete stack tree without spare links [denoted a "simplex complete stack tree" (CSTS)] and a complete stack tree with a spare link constructed to obtain a configuration called "CSTR" (where "R" refers to the fact that the resulting network topology is ringlike). The lower part of Figure 2 presents an example of calculated bus network reliabilities as functions of the node failure rate to demonstrate that a significant increase in reliability is expected to be achievable by the present method.
This work was done by Leon Alkalai, Savio Chau, and Ann Tai of Caltech for NASA's Jet Propulsion Laboratory. For further information, access the Technical Support Package (TSP) free on-line at www.nasatech.com/tsp under the Electronic Components and Systems category.
In accordance with Public Law 96-517, the contractor has elected to retain title to this invention. Inquiries concerning rights for its commercial use should be addressed to
Technology Reporting Office
JPL
Mail Stop 122-116
4800 Oak Grove Drive
Pasadena, CA 91109
(818) 354-2240
Refer to NPO-20817, volume and number of this NASA Tech Briefs issue, and the page number.
This Brief includes a Technical Support Package (TSP).

Networks Based on Stack-Tree Topology and a 1394 Bus (reference NPO-20817) is currently available for download from the TSP library.
Overview
The document discusses a method for designing fault-tolerant networks of computers and electronic circuits, particularly for deep-space missions, by utilizing commercial off-the-shelf (COTS) products and standards. The primary focus is on achieving high reliability and reduced costs in system development by employing COTS interfaces, which facilitate the integration of various components into the system architecture.
The authors, Leon Alkalai, Savio N. Chau, and Ann T. Tai, highlight the challenges of creating long-term survivable systems under stringent power and mass constraints. They propose a solution that exploits standard features of COTS products to mitigate their inherent shortcomings, even though these features were not originally designed for mission-critical applications. The paper presents a qualitative and quantitative analysis of a fault-tolerant COTS-based bus architecture compliant with the IEEE 1394 standard.
A key innovation discussed is the "stack-tree topology," which allows for the implementation of a fault-tolerant bus architecture without requiring node redundancy. This design not only complies with IEEE 1394 but also demonstrates significant reliability improvements through quantitative evaluations. The authors emphasize that this approach can lead to performance enhancements of up to two orders of magnitude compared to traditional avionics buses, while also significantly lowering the costs associated with developing reliable systems for deep-space exploration.
The document also notes that, at the time of writing, the technology had not been built or used commercially, although it had been disclosed to the referees and program committee of a conference. The authors regard NASA Tech Briefs as an appropriate forum for disseminating the research, since it could provide valuable insights to U.S. industry representatives interested in the technology.
In summary, the document outlines a novel approach to creating cost-effective, reliable fault-tolerant networks for space applications by leveraging COTS products and standards. It presents a detailed analysis of the proposed architecture, its advantages, and the potential impact on future deep-space missions, while also acknowledging the current status of the technology in terms of commercial viability.

