Who's Who at NASA
NTB: It provides traceability. At least when you take an exception to one of the rules, you know where the deviation is and if there’s a problem, you can track it back, correct?

Holzmann: Exactly right! And you can read the justification for the exception as well. It’s not just a hidden thought process; it’s now explicit.

So, over the next two years we encapsulated these rules in a slightly broader standard that was formally adopted as an Institutional Coding Standard for the Development of Flight Software at JPL. We now have a compact, single standard that covers all of our projects and missions, and we have the tools to check compliance with that standard. The MSL mission is the first large project to enforce the standard, and I’m convinced that it will help make that code more robust.

NTB: Some of the rules just seem like common sense, so why do you think the article generated so much interest when it was published?

Holzmann: Well, people do get excited about rules, especially if you want to restrict what a developer can do. If you are a developer, of course, you want no part of that. So I was actually surprised that there was such a positive response to the rules. Of course, the fact that we chose ten rules when there are other sets of ten rules that are very popular, maybe that sparked people’s interest as well.

NTB: Some people might argue that you’re trying to do the impossible – take very complex computer code written by humans and make it 100 percent error free. How do you respond to that?

Holzmann: That’s a very good point. That’s actually not what we’re trying to do. Writing software requires human ingenuity, and human beings are not perfect; no matter how careful and precise we try to be, we do occasionally make mistakes. We can certainly reduce the number of uncaught mistakes in software by a very wide margin, and we’re working hard to make that happen, but I don’t think it is reasonable to expect 100% perfection 100% of the time. An important part of our strategy is, therefore, to develop stronger fault containment techniques for flight software. When one software module gets the hiccups, for instance, there is no need for another software module to sneeze. More to the point, if one module experiences a fault, other modules that perform independent functions should be able to continue safely while the misbehaving module is restarted or replaced.

This is not how most of our current flight systems work. When one module faults, typically the entire system reboots abruptly. That can be fine on the surface of Mars when nothing much exciting is happening, but it would be unsurvivable during mission-critical events such as launch, orbit insertion, or the entry, descent, and landing phase. We have to learn to develop software systems with stronger firewalls between modules. The trick in building reliable systems is of course not in producing perfection. One doesn’t build a good skyscraper by using only perfect beams. Reliability is a system property, and not the property of any one component of a system.

We have to assume that every component of a system can break, so good system design means building reliable software systems from potentially unreliable software parts – just like we already know how to do by using redundancy in spacecraft hardware. Now, what works in hardware does not easily translate into software. You don’t make a software system more reliable, for instance, by running identical software on two computers in parallel, because both copies contain the same defects. So part of our research agenda here at JPL is to come up with software architectures that allow individual components to fail without bringing down the whole system. Those are fault containment techniques: if a module fails, it can be replaced or restarted while the rest of the system keeps running.
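As a rough illustration of the fault containment idea described above, the following is a minimal sketch in Python. It is not JPL's architecture or any actual flight software design. The names `Module`, `run`, `nav`, and `comm` are hypothetical, and a fault is simulated by raising an exception at a chosen cycle. The sketch only shows the principle that one module can fault and be restarted while an independent module keeps running.

```python
class Module:
    """A hypothetical software module with its own private state."""

    def __init__(self, name, fail_on=None):
        self.name = name
        self.fail_on = fail_on  # cycle at which a fault is simulated (assumption)
        self.steps = 0          # successful steps since the last restart
        self.restarts = 0       # how many times this module was restarted

    def step(self, cycle):
        # Simulate a fault at the configured cycle; otherwise do one unit of work.
        if cycle == self.fail_on:
            raise RuntimeError(f"{self.name} faulted at cycle {cycle}")
        self.steps += 1

    def restart(self):
        # Discard this module's state only; other modules are untouched.
        self.steps = 0
        self.restarts += 1


def run(modules, cycles):
    """Step every module each cycle, containing faults per module."""
    log = []
    for cycle in range(cycles):
        for module in modules:
            try:
                module.step(cycle)
            except Exception as exc:
                # Contain the fault: record it and restart only the faulty
                # module instead of rebooting the whole system.
                log.append(str(exc))
                module.restart()
    return log


nav = Module("nav")                # independent module, never faults here
comm = Module("comm", fail_on=2)   # faults once, at cycle 2
log = run([nav, comm], cycles=5)
```

In this sketch the `nav` module completes all five cycles even though `comm` faults partway through and is restarted. A real system would also need memory and timing isolation between modules, which a try/except in one process does not provide.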

NTB: One of your other interests in life is photography. Back in 1988, years before digital photography caught on, you authored a book called “Beyond Photography – The Digital Darkroom,” published by Prentice Hall. In addition to coining the term “digital darkroom,” the book also boldly predicted the invention and development of digital cameras and the resulting demise of film cameras. How did you see that coming?

Holzmann: That book was based on some fun work I had done on developing digital darkroom software at Bell Labs starting in 1984. When I saw some of my colleagues at Bell Labs who were into graphics working with synthesized images on a computer, I got this idea. Being a photography enthusiast, my idea was to use digitized photos and try to replicate anything you can do in the darkroom with a computer.

