Between 1985 and 1987, a radiation therapy device called the Therac-25 was involved in at least six incidents in which the device delivered massive overdoses of radiation. The patients involved suffered radiation burns and symptoms of radiation poisoning. Three of those patients eventually died, all because of a latent software bug. A race condition had gone undetected. It was a test case no one had thought to define.

Organizations with mission-critical and safety-critical needs like NASA, Lockheed Martin, and Honeywell are gradually making the shift to formal methods.

Thirty-five years have passed since the Therac-25 was brought to market in 1982. In that time, the volume and complexity of software in embedded systems has grown enormously. More and more of that software has become mission-critical and safety-critical. If embedded systems are to function effectively and safely, that software must be extremely reliable.

To meet ever-increasing reliability demands, new methodologies for specifying, designing, and coding the software in embedded systems — methods like model-based design — have evolved. Yet software verification, for the most part, has remained rooted in the same methods that were used to test the Therac-25. Test cases are still defined, and test coverage is still monitored; in other words, the procedures for verification of software have not kept pace with advances in designing and implementing it.

As the complexity of embedded systems and their reliance on software for mission-critical and safety-critical functions continue to grow, the organizations that develop these systems will eventually be forced to adopt more robust methodologies for their verification. Fortunately, recent advances have made verification techniques known as formal methods a viable alternative to traditional testing.

The use of formal methods for model-based design verification will offer systems and software engineers — and the companies they work for — a much higher level of confidence in the accuracy and robustness of the embedded systems they design and produce.

The time to begin transitioning to formal methods for model-based design verification is now. This article will explain why.

What are Formal Methods?

Formal methods verification tools find “odd” cases that testing often misses — cases that take many time steps to trigger.

In computer science, formal methods are techniques that use mathematical logic to reason about the behavior of computer programs. To apply formal methods in system verification, you (or a tool built for the purpose) must translate a system into a mathematical structure — a set of equations. You then apply logic, in the form of mathematical “rules,” to ask questions about the system and obtain answers about whether particular outcomes occur.

Formal methods go all the way back to Euclid. So, almost all of us have some experience with them from a secondary school geometry class. As you'll remember, they start with an axiom or postulate, which is taken as self-evident, and logic is used to reason toward the theorem using “rules” that had previously been proven true. If only the logical transformations allowed are applied, then the conclusion reached at the end — the theorem — must be true.

Formal methods for engineering computer systems work in much the same way. In computer science, formal methods really kicked off — on a theoretical basis — in the late 1960s and early ‘70s, when widespread use of computing was still in its infancy. Theoretical mathematicians were observing computer programming — still relatively simple at the time — and saying, “That's a mathematical structure; I can apply set theory to that.”

Tony Hoare is generally credited with introducing formal methods to computer science with his invention of Hoare logic. This and similar formal methods work much like algebra. They even make use of algebraic laws, like the associative, commutative, and distributive properties. If you apply the same operations to both sides of the equal sign, then the equation remains equal.

Let's say you want to prove that a specific output of your system never goes above a certain value. Using formal methods, you would apply your chosen set of rules to prove that your assumption — your requirement — is true. In the end, if you've applied your algorithms correctly, and if you find that indeed, your selected output never exceeds that specified value, then, as in a Euclidean proof, there is no question your theorem is true. You're absolutely certain. You've proven beyond a doubt that your system meets that requirement.

In contrast, if you were to apply a representative set of inputs to your system to test your assumption empirically, you could never really be sure your assumption was true, unless your set of test cases exercised all possible combinations of input values and stored states that affect the selected output — a daunting and exponential task in today's embedded software environment.

Formal Methods for Engineering Applications

Formal methods didn't gain much traction with industry until the 1990s. Before then, computers and computer programs were relatively simple, while formal methods were primitive and difficult to apply. Testing remained the most efficient means of system verification. Then, programming errors began getting companies into serious trouble.

Not long after the Therac-25 catastrophe, disaster struck AT&T's global long-distance phone network. On January 15, 1990, a bug in a new release of switching software caused a cascade of failures that brought down the entire network for more than nine hours. By the time the company's engineers had resolved the problem — by reloading the previous software release — AT&T had lost more than $60 million in unconnected calls. Plus, they'd suffered a severe blow to their reputation, especially amongst customers whose businesses depended on reliable long-distance service.

Four years later, a bug was discovered in the floating-point arithmetic circuitry of Intel's highly publicized Pentium processor. This error caused inaccuracies when the chip divided floating-point numbers within a specific range. Intel's initial offer — to replace the chips only for customers who could prove they needed high accuracy — met with such outrage that the company was eventually forced to recall the earliest versions. Ultimately, the Pentium FDIV bug cost Intel some $475 million.

The Therac-25, the AT&T switching control software, and the Intel Pentium chip were all tested extensively. Still, that testing failed to find the catastrophic bugs in those systems. Today, due in large part to the Pentium bug, formal methods verification is now a standard practice at Intel, and is used routinely by other manufacturers to verify IC chip designs. Yet software developers lag far behind hardware makers in the use of formal methods for embedded system verification.

This discrepancy is due primarily to the difference between IC logic and modern software logic. The logic in a CPU reduces to arrays of logic gates: ANDs, NANDs, ORs, etc. It's all Boolean. The formal methods engines used for Boolean logic, such as satisfiability solvers (SAT solvers), are now very well understood (thanks again to the Pentium bug, and to companies who picked up the ball and ran with it). Formal verification of ICs requires very fast computers, but only because the logic arrays are so vast.

Software is a whole different problem. Modern software logic is more complicated than IC logic. It requires more sophisticated mathematics. The solvers used in formal methods verification of software, known as satisfiability modulo theories SMT solvers, add mathematical constructs beyond Boolean logic.

SMT solvers have taken longer to mature; in fact, they're still evolving. For now, it is quite difficult to apply formal methods to the full source code of large-scale embedded applications. Converting large, complex source files, like a flight-control program, into formal methods language is still a daunting, arduous, and extremely time-consuming task. But that doesn't make formal methods software verification impossible.

To apply formal methods to a large software program today, you need to do one of two things. You can apply them to small portions of the program; critical parts that must work without fail, for example. Or you can apply them to an abstraction of the actual implementation. Model-based design is just such an abstraction. It simplifies the representation of the system and breaks it into interconnected blocks. This abstraction, in turn, simplifies both the task of translating the design into formal methods language, and the task of querying the system.

Recent breakthroughs, as well as complete coverage of the design, now make this second approach the preferred one for formal verification of embedded systems. But before this approach is discussed further, let's look more closely at the reasons for applying it.