Software developers face constant challenges. Not only must their code be functionally correct, it must also be reliable. In addition, competition among software vendors has taken software quality to significantly higher levels. Producing high-quality code has become one of the core requirements in the software industry today as it differentiates leading software vendors from the rest of the field.

Software development in the embedded world, however, presents a unique set of challenges.

Whether they realize it or not, millions of people throughout the world use some form of embedded system in their daily activities. Cell phones and MP3 players are good examples. Wouldn't it be nice to have a top-of-the-line cell phone or MP3 player for half the cost? Who wouldn't love to own an 80-GB MP3 player that could slip nicely into a coat pocket for the price of a toaster?

One of the major considerations in the design of embedded systems is cost. Reducing cost becomes a major requirement when a system is produced on an ever-increasing scale. One way to reduce cost is to limit the amount of memory available within a device. A large amount of memory is considered a luxury when it comes to designing embedded systems.

With the cost issue in mind, embedded software developers are required to write efficient code that constantly adheres to code size and performance constraints. This becomes quite a challenge for engineers developing operating systems. Scheduling complex algorithms to achieve correct functionality is a burden in itself, and as the number of tasks increases, protecting the data and functionality of one task from the other tasks becomes critically important. Needless to say, a bug in one task could eventually corrupt the entire system.

What is Virtual Memory?

As the name suggests, virtual memory (VM) creates the illusion of an unlimited amount of memory. It allows multiple processes with memory needs larger than the physical memory itself to run simultaneously on a processor. With virtual memory, for example, a 16 MB program can run on a 4 MB system.

It's not uncommon to want to run multiple processes whose combined size is much larger than the memory available on that processor. But at any given time, only a portion of those programs is active, so main memory need not contain the entire collection of processes at all times. Processes that do not fit in physical memory are stored on disk. Main memory can act as a "cache" containing the active portion of one program, leaving the others to reside on disk.

Figure 1. Memory hierarchy and corresponding access times.

Software developers are well aware of the memory hierarchy that consists of multiple levels of memory. The memory level closest to the CPU is the smallest and fastest. As the levels move away from the CPU they get bigger and slower (Figure 1).

The cost of each memory level varies significantly from one to the other as well. The cost per bit is highest at the level closest to the CPU, and it decreases as you move away. One of the goals of VM is to provide the end-user with as much memory as possible at very little cost while still providing fast access times. In reality, this is not always possible. Memory constraints continue to be a problem and software developers have always had to work around such issues.

VM addresses both limited memory and memory protection, which involves the safe sharing of memory among multiple programs. Processors vary in their actual implementation of VM. Therefore, although the general concepts remain the same, we will focus on the MIPS processor's implementation when going into more detail.

Implementing Virtual Memory

The levels that virtual memory deals with are main memory (often called physical memory) and secondary storage. Information or data is passed between these two levels. The unit of information is called a block, and entire blocks are copied between levels. In virtual memory, a block is called a page.

Paging is the approach used by most modern general-purpose operating systems to implement virtual memory. It divides the memory of a process into fixed-size pages, typically 4 KB. Pages are swapped in and out of main memory as needed. This gives the end-user the illusion of an unlimited amount of memory, because only a portion of that memory is active at any one time. It's critical to keep track of these pages. The page table, which resides in memory, does so and also acts as an index into main memory. A page table register points to the start of the page table and is used to locate it in memory.
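As a rough illustration of the arithmetic, the sketch below counts the pages involved when the 16 MB program mentioned earlier runs on a 4 MB system with 4 KB pages. The function name `pages_needed` and the constants are chosen here for illustration only.

```python
PAGE_SIZE = 4 * 1024   # 4 KB pages, the typical size quoted in the text
MB = 1024 * 1024


def pages_needed(num_bytes: int, page_size: int = PAGE_SIZE) -> int:
    """Number of fixed-size pages required to hold num_bytes (rounded up)."""
    return -(-num_bytes // page_size)   # ceiling division


program_pages = pages_needed(16 * MB)      # virtual pages the program spans
physical_frames = (4 * MB) // PAGE_SIZE    # page frames main memory can hold
```

Under these assumptions the program spans 4096 pages while main memory holds only 1024 frames, so at most a quarter of the program can be resident at any one time.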

Every entry in the page table contains certain bits to help identify the pages.

The valid bit is used to keep track of whether the page is in or out of main memory. If the valid bit is off, the page is not in main memory and a page fault occurs. The page then resides on disk and must be swapped in.
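The role of the valid bit can be sketched as follows. This is a simplified software model, not the MIPS mechanism; `PageTableEntry`, `PageFault`, `lookup`, and `handle_fault` are hypothetical names, and the disk read is only simulated.

```python
from dataclasses import dataclass


class PageFault(Exception):
    """Raised when a referenced page is not in main memory."""


@dataclass
class PageTableEntry:
    valid: bool = False   # True if the page is resident in main memory
    frame: int = 0        # physical page number, meaningful only when valid


def lookup(page_table, vpn):
    """Return the physical page number for vpn, or raise PageFault."""
    entry = page_table[vpn]
    if not entry.valid:
        raise PageFault(f"virtual page {vpn} is on disk")
    return entry.frame


def handle_fault(page_table, vpn, free_frame):
    """Hypothetical handler: pretend the page was read from disk into free_frame."""
    page_table[vpn] = PageTableEntry(valid=True, frame=free_frame)
```

A caller would catch `PageFault`, run the handler, and then retry the access, which mirrors how a faulting instruction is restarted once the page has been swapped in.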

Virtual memory can also use a two-level page table. The first-level page table is always present, while second-level page tables are allocated on demand. Entries in a second-level page table carry additional bits that describe each page's properties. A few of these properties are as follows:
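A minimal sketch of on-demand allocation in a two-level table is shown here. The 10-bit split between the two levels and the class name `TwoLevelPageTable` are assumptions made for the example, not details taken from any particular processor.

```python
L2_BITS = 10              # assumed: low 10 bits of the VPN index the second level
L2_SIZE = 1 << L2_BITS


class TwoLevelPageTable:
    def __init__(self, l1_size=1 << 10):
        # The first level always exists; its slots start with no second-level table.
        self.level1 = [None] * l1_size

    def map(self, vpn, frame):
        l1, l2 = vpn >> L2_BITS, vpn & (L2_SIZE - 1)
        if self.level1[l1] is None:
            self.level1[l1] = [None] * L2_SIZE   # allocate on demand
        self.level1[l1][l2] = frame

    def lookup(self, vpn):
        l1, l2 = vpn >> L2_BITS, vpn & (L2_SIZE - 1)
        table = self.level1[l1]
        return None if table is None else table[l2]

    def second_level_tables(self):
        """How many second-level tables have been allocated so far."""
        return sum(t is not None for t in self.level1)
```

Mapping several neighboring pages allocates a single second-level table, so a process that touches only a small part of its address space pays for only a small part of the full table.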

  • The read and write permission bits indicate whether the page is readable or writable, respectively.
  • The execution bit indicates whether the page is executable.
  • The modified bit, when set, indicates that the page has been modified since the last time the bit was cleared.

Figure 2. Two processes within a single physical memory. In Process 1, the address space ranges from 0 to 1000 while Process 2 ranges from 1001 to 2001.
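The permission and status bits listed above could be modeled in software as a set of flags. The sketch below is illustrative only: `PageFlags` and `store` are hypothetical names, and real hardware packs these bits into the page table entry itself.

```python
from enum import Flag, auto


class PageFlags(Flag):
    READ = auto()
    WRITE = auto()
    EXECUTE = auto()
    MODIFIED = auto()


def store(flags: PageFlags) -> PageFlags:
    """Model a store to a page: it requires WRITE and sets MODIFIED."""
    if PageFlags.WRITE not in flags:
        raise PermissionError("page is not writable")
    return flags | PageFlags.MODIFIED


code_page = PageFlags.READ | PageFlags.EXECUTE
data_page = PageFlags.READ | PageFlags.WRITE
```

Here a store to `data_page` succeeds and marks it modified, while a store to `code_page` is rejected.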

The above page table manipulations begin with the CPU providing a virtual address. The Memory Management Unit (MMU) then converts this address to a physical address which is then used to access main memory. How is this accomplished? Given a virtual address, a certain number of bits, known as the virtual page number, are used as an index to access the page table, whose corresponding entry contains the physical page number. The rest of the bits of the virtual address, known as the page offset, are appended to the physical page number to form the physical address.
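The translation just described can be sketched in a few lines. The example assumes 4 KB pages (a 12-bit page offset) and uses a plain dictionary as a stand-in for the page table; the mapping shown is hypothetical.

```python
PAGE_OFFSET_BITS = 12                  # assumed 4 KB pages: 2**12 bytes
PAGE_SIZE = 1 << PAGE_OFFSET_BITS


def translate(virtual_address: int, page_table: dict) -> int:
    vpn = virtual_address >> PAGE_OFFSET_BITS     # upper bits: virtual page number
    offset = virtual_address & (PAGE_SIZE - 1)    # lower bits: page offset
    ppn = page_table[vpn]                         # page table maps VPN to PPN
    return (ppn << PAGE_OFFSET_BITS) | offset     # append the offset to the PPN


page_table = {0x2: 0xA5}                   # hypothetical mapping: VPN 0x2 -> PPN 0xA5
physical = translate(0x2ABC, page_table)   # 0xA5ABC
```

Only the page number changes during translation; the offset `0xABC` passes through untouched.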

In an OS, each process has its own virtual address space, which is the set of all virtual addresses the process can use. Notice in Figure 2 that the virtual addresses are mapped onto different physical addresses, which simplifies loading the program for execution. This is known as "relocation" and is one of the advantages of virtual memory. Before the concept of virtual memory came into existence, software developers had to divide programs into overlays. Overlays were mutually exclusive pieces of a program that had to be loaded into and unloaded from memory under user program control. As one would imagine, this was a huge responsibility for programmers. Each overlay consisted of its own instructions and data.

Looking at Figure 2, we also notice that virtual memory provides memory protection by giving each process a different address space. Each program is compiled as if it had an address space of its own, because the physical locations of its pages can change dynamically while it is running.
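One way to picture this protection is to give every process its own page table, as in the sketch below. The process names, frame numbers, and the `ProtectionFault` exception are all hypothetical.

```python
PAGE_SIZE = 4096   # assumed 4 KB pages


class ProtectionFault(Exception):
    """Raised when a process touches an address outside its own address space."""


# Each process has a private page table: virtual page number -> physical frame.
page_tables = {
    "process_1": {0: 10, 1: 11},
    "process_2": {0: 20, 1: 21},
}


def translate(pid, virtual_address):
    vpn, offset = divmod(virtual_address, PAGE_SIZE)
    table = page_tables[pid]
    if vpn not in table:
        raise ProtectionFault(f"{pid}: {virtual_address:#x} is not mapped")
    return table[vpn] * PAGE_SIZE + offset
```

The same virtual address resolves to different physical addresses in the two processes, and neither table contains an entry that reaches the other's frames.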

Speeding Up Address Translation

Address translation itself can be expensive. Because the page table resides in main memory, every memory reference means one memory access to obtain the physical address and a second access to get the data. To reduce this overhead, a Translation Look-aside Buffer (TLB) is used. The TLB acts as a cache for page table entries that map to physical pages only. A TLB works on the principle of locality of reference, both temporal and spatial.

Given a virtual page number, the memory management unit checks the TLB for a matching translation. If the translation exists, we get a "TLB hit" and the physical page number is used to form the address. If it does not, a "TLB miss" is registered. The page might still be in memory, just not in the TLB; in that case the CPU loads the translation from the page table into the TLB. But if the page table holds no valid translation either, a page fault has occurred and an exception is raised. A page fault is therefore an exception triggered by the MMU.
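The hit, miss, and page-fault paths can be sketched with a small software model. This is an illustration under simplifying assumptions: the TLB is fully associative, it evicts the least recently used entry (a policy picked for the example), and the page table is a dictionary containing only resident pages.

```python
from collections import OrderedDict


class PageFault(Exception):
    """Raised when neither the TLB nor the page table has a translation."""


class TLB:
    def __init__(self, capacity=4):
        self.capacity = capacity
        self.entries = OrderedDict()   # vpn -> ppn, least recently used first
        self.hits = 0
        self.misses = 0

    def translate(self, vpn, page_table):
        if vpn in self.entries:                   # TLB hit
            self.hits += 1
            self.entries.move_to_end(vpn)
            return self.entries[vpn]
        self.misses += 1                          # TLB miss
        if vpn not in page_table:                 # not resident: page fault
            raise PageFault(vpn)
        if len(self.entries) >= self.capacity:    # make room for the new entry
            self.entries.popitem(last=False)
        self.entries[vpn] = page_table[vpn]       # refill from the page table
        return self.entries[vpn]
```

Repeated references to the same page are served from the TLB after the first miss, which is where locality of reference pays off.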

A page fault can be generated due to the following:

  • Page is not resident,
  • Page is invalid,
  • Permission is not enough for operation.

The page that generated the page fault must be brought into main memory. If no free page frame is available, another page has to be replaced. Below are a few page replacement policies that might be implemented.

  1. FIFO (First In, First Out): The page that has been in physical memory the longest is replaced.
  2. LRU (Least Recently Used): The page that has not been accessed for the longest time is replaced. Here the principle of locality is exploited: a page that was referenced recently will most likely be referenced again soon.
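The two policies can be compared by counting page faults over a sequence of page references. The sketch below is a simulation only; the function names and the reference string are chosen for illustration.

```python
from collections import OrderedDict, deque


def fifo_faults(references, num_frames):
    """Count page faults when the oldest resident page is evicted first."""
    frames, order, faults = set(), deque(), 0
    for page in references:
        if page in frames:
            continue
        faults += 1
        if len(frames) == num_frames:
            frames.discard(order.popleft())   # evict the page loaded earliest
        frames.add(page)
        order.append(page)
    return faults


def lru_faults(references, num_frames):
    """Count page faults when the least recently used page is evicted."""
    frames, faults = OrderedDict(), 0
    for page in references:
        if page in frames:
            frames.move_to_end(page)          # mark as most recently used
            continue
        faults += 1
        if len(frames) == num_frames:
            frames.popitem(last=False)        # evict the least recently used
        frames[page] = True
    return faults


refs = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1]
```

With three frames, this reference string produces 15 faults under FIFO and 12 under LRU. The gap depends on the workload; LRU benefits here because recently used pages are reused.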

Designing Virtual Memory Systems

Given the high access times of the lower memory levels (Figure 1), software developers have a variety of design choices to weigh when developing virtual memory systems. These include:

  • Pages should be large enough to amortize the high access time of disk. 4 KB is the standard on Pentium (x86) processors, and 8 KB on UltraSPARC. Typical page sizes range from 4 KB to 16 KB.
  • Reducing page faults is one of the key design factors for virtual memory systems. Using fully associative placement of pages helps.
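  • Again, looking back at Figure 1, a page fault incurs the very high access time of disk. Handling page faults in software is therefore worthwhile: the overhead is small next to the disk access, and smarter placement algorithms can help reduce the miss rate.
  • Use a write-back rather than a write-through policy in virtual memory, because writing every store through to disk would take too long.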
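Page size also affects how much memory the page table itself consumes. The sketch below estimates the size of a flat, single-level page table, assuming a 32-bit virtual address space and 4-byte entries; both figures are assumptions made for the example, not values from the article.

```python
KB = 1024


def page_table_bytes(page_size, address_bits=32, entry_bytes=4):
    """Size of a flat page table that covers the whole virtual address space."""
    entries = (1 << address_bits) // page_size
    return entries * entry_bytes


sizes = {page_size: page_table_bytes(page_size) for page_size in (4 * KB, 8 * KB, 16 * KB)}
```

Under these assumptions a 4 KB page size needs a 4 MB table per process, while 16 KB pages cut that to 1 MB. The price of larger pages is more unused space inside partly filled pages.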

Conclusion

We have seen that through virtual memory, software developers are relieved of several burdens. Virtual memory offers many conveniences for programmers. The end-user is also rewarded with the ability to run programs that exceed actual memory, overcoming the limitations of physical memory.

Virtual memory also offers memory protection. Each process is assigned its own address space and page table, similar to the memory protection available on Mentor Graphics' Nucleus operating system. If one process tries to access memory outside its range, an error will result. Communication between processes is achieved using shared memory. Therefore, through virtual memory, an engineer is allowed more freedom to develop complex applications.

Although the advantages of virtual memory are significant, employing this strategy can have side effects. Virtual memory can reduce the speed of the overall system because every memory reference requires an address translation, which may mean an additional access to physical memory to read the page table. Pages not present in physical memory will have to be swapped in. Continuous swapping of pages in and out of memory reduces speed further.
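The cost of swapping can be estimated with a simple weighted average of the two access paths. The timings used below (100 ns for main memory, 10 ms to service a page fault) are assumed round numbers for illustration and are not taken from Figure 1.

```python
def effective_access_time(memory_ns, fault_penalty_ns, fault_rate):
    """Average time per memory reference for a given page fault rate."""
    return (1 - fault_rate) * memory_ns + fault_rate * fault_penalty_ns


MEMORY_NS = 100                  # assumed main memory access time
FAULT_PENALTY_NS = 10_000_000    # assumed 10 ms to bring a page in from disk

no_faults = effective_access_time(MEMORY_NS, FAULT_PENALTY_NS, 0.0)
rare_faults = effective_access_time(MEMORY_NS, FAULT_PENALTY_NS, 1e-5)
```

With these numbers, a fault on just one reference in 100,000 roughly doubles the average access time, from 100 ns to about 200 ns. This is why keeping the fault rate low matters so much.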

Even though many systems have the capability to use VM, software developers are averse to using it because of additional overhead concerns. This common misperception is changing as more software developers implement balanced VM techniques. Indeed, if more developers adopted VM strategies, the day your 3-GB cell phone device will cost less than its monthly service fee may be just around the corner.

This article was written by Larina D'Souza, Software Development Engineer, Embedded Systems Division at Mentor Graphics (Wilsonville, OR).

For more information, visit http://info.hotims.com/10981-400 .



This article first appeared in the November, 2007 issue of Embedded Technology Magazine.
