A hardware unit has been designed that reduces the cost, in terms of performance and power consumption, of implementing N-modular redundancy (NMR) in a multiprocessor device. The unit monitors transactions to memory and calculates a form of sumcheck on the fly, thereby relieving the processors of calculating the sumcheck in software.
This sumcheck could be calculated using addition operations or CRC-type (cyclic redundancy check) operations, whichever is most economical in terms of die area and power consumption. In each of the NMR systems, the sumcheck logic is initialized at the start of a task (a well-defined unit of work that will be performed by each of the NMR systems), and its value is then captured and transmitted to the vote-taker at the end of the task. The vote-taker compares the sumchecks and determines whether errors have occurred and what action, if any, should be taken to correct them.
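The task flow described above (initialize, accumulate over memory transactions, capture, vote) can be illustrated with a small software model. This is only a sketch under stated assumptions, not the hardware design: the names `SumcheckUnit`, `run_task`, and `vote` are hypothetical, the transaction format (a 32-bit address plus data bytes) is assumed, CRC-32 from Python's `zlib` stands in for the unspecified CRC-type operation, and a simple majority rule stands in for the vote-taker's unspecified policy.

```python
import zlib
from collections import Counter


class SumcheckUnit:
    """Software model of one system's sumcheck unit (hypothetical interface)."""

    def __init__(self, mode="add"):
        self.mode = mode  # "add" for additive sumcheck, "crc" for CRC-32
        self.value = 0

    def start_task(self):
        # Initialize the sumcheck logic at the start of a task.
        self.value = 0

    def observe_write(self, address, data):
        # Fold one observed memory transaction into the running sumcheck.
        # Assumed format: 32-bit address followed by the data bytes.
        record = address.to_bytes(4, "little") + data
        if self.mode == "crc":
            self.value = zlib.crc32(record, self.value)
        else:
            self.value = (self.value + sum(record)) & 0xFFFFFFFF

    def end_task(self):
        # Capture the value that would be sent to the vote-taker.
        return self.value


def run_task(writes, mode="add"):
    """Replay a list of (address, data) writes and return the sumcheck."""
    unit = SumcheckUnit(mode)
    unit.start_task()
    for address, data in writes:
        unit.observe_write(address, data)
    return unit.end_task()


def vote(sumchecks):
    """Majority vote over the sumchecks of the N redundant systems.

    Returns (agreed_value, dissenting_indices); agreed_value is None
    when no strict majority exists (assumed policy).
    """
    winner, count = Counter(sumchecks).most_common(1)[0]
    if count * 2 <= len(sumchecks):
        return None, list(range(len(sumchecks)))
    return winner, [i for i, s in enumerate(sumchecks) if s != winner]
```

With three redundant systems, two of which replay identical writes while the third differs in one data byte, `vote` returns the majority sumcheck and the index of the dissenting system. A plain additive sum is insensitive to the order of transactions and can miss errors that cancel each other out, whereas a CRC catches more error patterns at the cost of extra logic; that is the kind of trade-off the choice between the two operations involves.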
The advantage over existing techniques is that minimal logic is required to implement the sumcheck unit, minimal power is consumed by the sumcheck unit when active, and the unit can have a reduced-power sleep mode when inactive. Calculating a sumcheck for a task using the sumcheck unit requires no additional cycles, and so has lower latency than calculating it in the processing unit as a post-task step.
This work was done by Keith Bindloss and Carl Dobbs, Sr. of Coherent Logix for Goddard Space Flight Center. GSC-16324-1