The increasing emphasis on green technologies has focused more attention on low power design. Microcontroller vendors are responding by increasing their offerings of ultra low power devices that consume as little as 350 uA/MHz and have sub-uA sleep modes.
Using a low power microcontroller is a good first step in lowering power consumption. The next step should be to minimize the number of instructions the CPU must execute to get the job done, so it can spend as much time as possible in the deepest sleep mode available. This can be accomplished with careful design of the application software structure, tight coding and a compiler that reduces the number of required instructions. Compilers can have a significant effect on the number of instructions required to execute the application and the resulting power consumption. All too often the method of compilation wastes CPU cycles on interrupt routines, and on locating addresses in devices with banked memory.
Interrupt routines should always be small and fast. This is especially true when speed or power consumption is critical. Keeping interrupt routines small reduces interrupt overhead.
The compiler's contribution to interrupt overhead lies in the way it generates the context save, that is, the number of registers it saves in response to an interrupt. Ordinary compilers save every register that might be used by an interrupt because they have no way of knowing which registers will or will not be used by a given interrupt. The problem is that the number of cycles used is a direct function of the number of registers that are saved and restored, and cycles spent saving and restoring the context consume power. For example, one compiler for Microchip's PIC16 always saves 8 bytes of data for every interrupt, using a total of 42 instruction cycles (23 for the context save and 19 for the restore). This may not seem like much, but in an interrupt-intensive application, the CPU could spend thousands of extra cycles "awake," unnecessarily consuming power.
Newer compilers are now available with "omniscient code generation" (OCG) technology that has the intelligence to save only those registers that are required for each particular interrupt. Omniscient code generation works by collecting comprehensive data on register, stack, pointer, object and variable declarations from all program modules before compiling the code. An OCG compiler combines all the program modules into one large program, which it loads into a call graph structure. Based on the call graph, the OCG code generator creates a pointer reference graph that shows each instance of a variable having its address taken, plus each instance of an assignment of a pointer value to a pointer (either directly, via function return, function parameter passing, or indirectly via another pointer). It then identifies all objects that can possibly be referenced by each pointer. This information is used to determine exactly how much memory space each pointer will be required to access.
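The pointer reference graph described above can be illustrated with a small fixed-point propagation. This is only a sketch under stated assumptions, not the way any particular OCG compiler is implemented: the names ptr_copy, propagate_targets and banks_reached are hypothetical, each object and bank is represented by one bit in a mask, and only direct pointer-to-pointer assignments are modelled.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_PTRS    8   /* hypothetical limit on pointers in the model */
#define MAX_OBJECTS 32  /* one bit per object in a 32-bit mask */
#define MAX_BANKS   8   /* one bit per bank in an 8-bit mask */

/* One pointer-to-pointer assignment: dst = src; */
typedef struct {
    int dst;
    int src;
} ptr_copy;

/* On entry, targets[p] holds the objects whose address is assigned
   directly to pointer p. On return, it also holds every object that
   can reach p through a chain of pointer assignments. */
void propagate_targets(uint32_t *targets, int n_ptrs,
                       const ptr_copy *copies, int n_copies)
{
    assert(n_ptrs > 0 && n_ptrs <= MAX_PTRS);
    int changed = 1;
    while (changed) {               /* repeat until no set grows */
        changed = 0;
        for (int i = 0; i < n_copies; i++) {
            int d = copies[i].dst;
            int s = copies[i].src;
            assert(d >= 0 && d < n_ptrs && s >= 0 && s < n_ptrs);
            uint32_t merged = targets[d] | targets[s];
            if (merged != targets[d]) {
                targets[d] = merged;
                changed = 1;
            }
        }
    }
}

/* Bitmask of the memory banks a pointer may have to reach, given the
   set of objects it can reference and the bank of each object. */
uint8_t banks_reached(uint32_t target_mask,
                      const uint8_t *object_bank, int n_objects)
{
    assert(n_objects >= 0 && n_objects <= MAX_OBJECTS);
    uint8_t banks = 0;
    for (int o = 0; o < n_objects; o++) {
        if (target_mask & (UINT32_C(1) << o)) {
            assert(object_bank[o] < MAX_BANKS);
            banks |= (uint8_t)(1u << object_bank[o]);
        }
    }
    return banks;
}
```

In this model, a pointer whose result from banks_reached has a single bit set only ever addresses one bank, so it could in principle be given a narrower representation than a pointer that must span several banks.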
Since an OCG compiler knows exactly which functions call, and are called by, other functions, which variables and registers are required, and which pointers are pointing to which memory banks, it also knows exactly which registers will be used for every interrupt in the program. It can generate code accordingly, minimizing both the code size and the cycles required to save and restore the context.
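As a rough illustration of how whole-program call graph knowledge translates into a smaller context, the sketch below walks the call graph from an interrupt routine and collects only the registers that can actually be modified. The data model is an assumption made for this example (func_info, context_mask and context_size are hypothetical names, and registers and functions are represented as bits in 8-bit masks); a real code generator tracks far more detail.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_FUNCS 8  /* hypothetical limit: one bit per function */

/* Hypothetical per-function summary gathered from all modules. */
typedef struct {
    uint8_t regs_used;  /* bit i set: this function modifies register i */
    uint8_t callees;    /* bit j set: this function calls function j */
} func_info;

/* Union of the registers modified by function `root` and by every
   function reachable from it in the call graph. */
uint8_t context_mask(const func_info *funcs, int n, int root)
{
    assert(n > 0 && n <= MAX_FUNCS && root >= 0 && root < n);
    unsigned valid = (1u << n) - 1u;          /* ignore bits beyond n */
    uint8_t visited = 0;
    uint8_t pending = (uint8_t)(1u << root);
    uint8_t mask = 0;

    while (pending) {
        int f = 0;
        while (!(pending & (1u << f)))        /* lowest pending function */
            f++;
        pending &= (uint8_t)~(1u << f);
        visited |= (uint8_t)(1u << f);
        mask |= funcs[f].regs_used;
        /* queue callees that have not been visited yet */
        pending |= (uint8_t)(funcs[f].callees & ~visited & valid);
    }
    return mask;
}

/* Number of registers to save and restore: the count of set bits. */
int context_size(uint8_t mask)
{
    int count = 0;
    for (; mask; mask &= (uint8_t)(mask - 1))
        count++;
    return count;
}
```

A compiler without this information would have to assume the full mask for every interrupt, while the walk above yields a mask specific to each interrupt routine.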
With an OCG compiler, the smallest context requires 17 cycles: 10 to save the context and 7 to restore it. The worst case for the OCG compiler is only 25 cycles. Compared to the 42 cycles of the conventional compiler cited above, an OCG compiler can reduce the number of interrupt-related instruction cycles by 40% to 60%.
Depending on the application, the cycle savings can be substantial. An interrupt-driven serial communication port with a baud rate of 480,600 bps generates 24,000 interrupts per second. Using a conventional compiler with 42 instruction cycles per interrupt (168 clock cycles per interrupt), saving and restoring the context will use 4,032,000 clock cycles per second, or 20% of the available cycles on a 20 MHz PIC16. An OCG compiler, averaging 21 instruction cycles per interrupt (84 clock cycles per interrupt), can reduce that number to 2,016,000 cycles, saving half the clock cycles otherwise spent on saving and restoring contexts and allowing the CPU to be put into sleep mode for a further 10% of its cycles. Assuming the MCU draws 10 mA when active and about 1 uA in sleep mode, an OCG compiler could reduce total MCU current consumption by about 1 mA, or roughly 10%. In an application with an 8 mA power budget, that extra milliamp could be a life saver.
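The 40% to 60% range follows directly from the cycle counts quoted in the text. A minimal helper for checking it is sketched below; the function name reduction_percent is made up for this example, and the result is truncated to a whole percent.

```c
#include <assert.h>

/* Percentage of cycles saved when the context cost drops from
   `baseline` to `reduced`, truncated to a whole percent. */
int reduction_percent(int baseline, int reduced)
{
    assert(baseline > 0 && reduced >= 0 && reduced <= baseline);
    return (baseline - reduced) * 100 / baseline;
}
```

With the 42-cycle conventional context as the baseline, reduction_percent(42, 17) gives 59 for the smallest OCG context and reduction_percent(42, 25) gives 40 for the worst case.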
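The arithmetic in this example can be reproduced with a few integer calculations. The sketch below uses only the figures given in the text (24,000 interrupts per second, 4 clock cycles per PIC16 instruction cycle, a 20 MHz clock, 10 mA active and 1 uA sleep current); the function names are hypothetical, and the model assumes every cycle freed from context handling is spent in sleep mode.

```c
#include <assert.h>
#include <stdint.h>

/* Clock cycles per second spent saving and restoring the context. */
int64_t context_clocks_per_sec(int64_t irq_per_sec,
                               int64_t instr_cycles_per_irq,
                               int64_t clocks_per_instr)
{
    assert(irq_per_sec >= 0 && instr_cycles_per_irq >= 0 &&
           clocks_per_instr > 0);
    return irq_per_sec * instr_cycles_per_irq * clocks_per_instr;
}

/* Share of the available clock cycles, in hundredths of a percent. */
int64_t share_basis_points(int64_t clocks, int64_t clock_hz)
{
    assert(clock_hz > 0);
    return clocks * 10000 / clock_hz;
}

/* Average current saved, in microamps, if the freed clock cycles are
   spent asleep instead of active (truncated to a whole microamp). */
int64_t current_saved_ua(int64_t freed_clocks, int64_t clock_hz,
                         int64_t active_ua, int64_t sleep_ua)
{
    assert(clock_hz > 0 && active_ua >= sleep_ua);
    return freed_clocks * (active_ua - sleep_ua) / clock_hz;
}
```

For the conventional compiler, context_clocks_per_sec(24000, 42, 4) returns 4,032,000, which share_basis_points reports as 2016 (20.16%) of a 20,000,000 Hz clock. For the OCG average of 21 instruction cycles the figures are 2,016,000 and 1008 (10.08%). The 2,016,000 freed cycles correspond to current_saved_ua(2016000, 20000000, 10000, 1), which returns 1007 uA, or about 1 mA.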
Banked Memory Architectures
Many 8- and 16-bit microcontrollers have banked memories that cannot be addressed simultaneously. Switching between the memory banks requires at least two bank selection instructions. Thus, if data in one bank must be written to another bank, bank selection instructions are always necessary. Placing all the variables accessed by a function in the same memory bank reduces the number of bank selection instructions and the total cycles required by the application. However, conventional compilers have no way of knowing which functions access which variables, so they cannot optimize memory assignment. Nor do these compilers have any way of knowing whether a particular memory bank will already be selected at any point in the code. As a result, they automatically generate bank selection instructions for every memory access, whether or not that bank is already selected.
Some compilers have extensions to the C language that identify the address of the variable. Programmers may manually assign variables to memory banks using this non-standard, non-portable code. The bank qualifiers allow the compiler to see the exact bank an object resides in and reduce the number of bank selection instructions. However, this approach does not guarantee that dependent variables will be placed in the same bank. Every time a variable in one memory bank needs to be written to another memory bank, bank selection instructions will still be required. In addition, trying to track all the memory addresses across multiple code modules and ensuring that all pointers have the correct addresses is a time-consuming, tedious process that can itself introduce programming errors.
In contrast, an OCG compiler knows every register, stack, pointer, object and variable declaration from all program modules, and can optimize every variable and register allocation, as well as the size and scope of every pointer and every object stored on the compiled stack. It optimizes memory allocation to minimize or eliminate bank selection instructions, without any intervention from the programmer.
By placing frequently accessed variables in unbanked memory and by placing any dependent variables in the same memory bank, an OCG compiler can radically reduce the number of cycles and power wasted on bank selection instructions in these MCU architectures. Since the OCG compiler knows which bank is selected at any point in the code, it can also eliminate any unnecessary bank selection instructions when the bank is already selected. Eliminating these instructions can reduce the number of CPU cycles by 30% to 50%.
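The effect of tracking the selected bank can be illustrated by counting bank selection instructions for a sequence of memory accesses under both strategies. This is a simplified model, not actual compiler output: the names are hypothetical, each bank switch is assumed to cost two instructions as stated earlier, accesses to unbanked memory are assumed to need no bank selection under either strategy, and the code is treated as a single straight-line sequence.

```c
#include <assert.h>
#include <stddef.h>

#define BANKSEL_COST 2     /* assumption: two instructions per bank switch */
#define UNBANKED     (-1)  /* hypothetical marker for unbanked memory */
#define UNKNOWN_BANK (-2)  /* hypothetical marker: selected bank not known */

/* Conventional strategy: select the bank before every banked access. */
int banksel_naive(const int *banks, size_t n)
{
    int instrs = 0;
    for (size_t i = 0; i < n; i++) {
        if (banks[i] != UNBANKED)
            instrs += BANKSEL_COST;
    }
    return instrs;
}

/* Tracked strategy: select the bank only when it differs from the
   bank already known to be selected. */
int banksel_tracked(const int *banks, size_t n, int initial_bank)
{
    int current = initial_bank;
    int instrs = 0;
    for (size_t i = 0; i < n; i++) {
        assert(banks[i] >= UNBANKED);
        if (banks[i] == UNBANKED || banks[i] == current)
            continue;              /* no bank selection needed */
        instrs += BANKSEL_COST;
        current = banks[i];
    }
    return instrs;
}
```

For the access sequence 0, 0, 1, 1, 0, UNBANKED, 0 the naive count is 12 instructions, while the tracked count starting from UNKNOWN_BANK is 6. If the same variables are all placed in bank 0, the tracked count drops to 2, which shows why grouping dependent variables in one bank matters as much as tracking the selected bank.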
Choosing a low power device and exploiting the sleep mode capabilities of the MCU are important means of keeping power consumption to a minimum. However, the way in which the compiler manages interrupts and memory usage can also have a significant impact on power consumption. Newer compilers with omniscient code generation technology can make a substantial contribution to saving cycles and power.