As the cooling challenges in embedded system applications have multiplied due to increased processing performance, smaller package and system footprints, and the requirement to operate in more rugged environments, new thermal management options and industry standards continue to evolve. Designers of systems for these markets, and especially those in remote or rugged, 24/7 operating environments, have always had to make difficult decisions regarding lifespan, reliability, and cost.
But which cooling option is best for a particular application? To assist with the selection process, it helps to understand the four primary cooling methods used in embedded systems: onboard fans, passive cooling with system fans, conduction cooling, and fanless convection cooling. While some applications depend on industry standards that address cooling concerns in their basic specification, others benefit from a variety of component building blocks to achieve cost-effective resolution of thermal management issues.
Understanding Your Options
The thermal management method used depends on the application and the individual devices designed into the system. By combining knowledge of those devices with a thorough analysis of the thermal makeup of the application, designers can more easily identify the optimal cooling option. Below is an overview of the main electronics cooling options. Examples of typical usage of each cooling option are presented in the accompanying table.
Onboard Fans: By far the most popular method of thermal management is the use of onboard or passive cooling fans somewhere in the embedded design. Fans are also one of the most cost-efficient options for system cooling.
The most efficient fan-based cooling mechanism in terms of heat dissipation has a fan directly mounted to the heatsink of a CPU board. The fan is paired with the CPU and heatsink to provide optimum cost-effective cooling. How well a fan-based solution works depends on its design and construction. There are new fans available that provide significant enhancements in blade and bearing design, improving airflow while reducing noise and vibration. The thermal limit in terms of ambient air temperature is typically provided by the supplier of such a board.
While very efficient, this type of cooling has some reliability issues. First, CPU fans tend to be small, high-RPM fans, which are failure-prone in 24/7 embedded applications. Second, in applications with more than one board that requires cooling, there will be multiple fans, further reducing the overall reliability of the system. Still, for small embedded systems that dissipate a lot of power and/or cannot afford the space required by a larger fan, onboard fans may be the only acceptable option.
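The reliability penalty of multiple fans can be illustrated with a simple series model. The sketch below is not taken from any fan datasheet: it assumes identical fans that fail independently at a constant rate, treats the first fan failure as a system failure, and uses a hypothetical 50,000-hour fan MTBF purely for illustration.

```python
import math

HOURS_PER_YEAR = 24 * 365
FAN_MTBF_HOURS = 50_000.0  # hypothetical figure, for illustration only


def series_mtbf(fan_mtbf_hours, fan_count):
    """Combined MTBF of identical fans when any single failure counts as a
    system failure (series model, constant failure rate assumed)."""
    if fan_count < 1:
        raise ValueError("fan_count must be at least 1")
    return fan_mtbf_hours / fan_count


def survival_probability(mtbf_hours, mission_hours):
    """Probability of no failure over mission_hours, assuming an
    exponential lifetime distribution."""
    return math.exp(-mission_hours / mtbf_hours)


for count in (1, 2, 4):
    mtbf = series_mtbf(FAN_MTBF_HOURS, count)
    p = survival_probability(mtbf, HOURS_PER_YEAR)
    print(f"{count} fan(s): combined MTBF {mtbf:,.0f} h, "
          f"P(no fan failure in one year) = {p:.2f}")
```

Under these assumptions, doubling the number of fans halves the combined MTBF, which is the effect the paragraph above describes for multi-board systems.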
Passive Cooling With System Fans: In systems with reliability concerns or high mean time between failure (MTBF) requirements, a better thermal solution may be to use one or more larger fans blowing air across all boards in the system. Each board would still have a heatsink, but instead of a dedicated fan providing airflow for the board, the airflow across the board depends on the system fans.
The effectiveness of this type of cooling is dependent on the mass airflow over the board. In other words, the amount of heat dissipation is directly related to the number of air molecules coming in contact with the heatsink. While this is also true for systems with onboard fans, it is a more important consideration for passively cooled systems with central fans because the configuration of the system can greatly affect the amount of airflow a given board realizes. Thus, the challenge for the system designer is to make sure that the airflow is balanced throughout the system so each slot is able to have a similar amount of airflow.
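The link between mass airflow and heat removal can be sketched with the steady-state energy balance Q = m_dot * cp * dT. The example below assumes dry air with a specific heat of about 1005 J/(kg·K) and a sea-level density of about 1.2 kg/m^3; the function name and the airflow figures are illustrative, and the result is the heat carried away by the air stream, not a model of any particular heatsink.

```python
CFM_TO_M3_PER_S = 0.000471947  # one cubic foot per minute, in m^3/s
AIR_CP_J_PER_KG_K = 1005.0     # approximate specific heat of dry air
SEA_LEVEL_DENSITY = 1.2        # kg/m^3, approximate value near room temperature


def heat_removed_watts(airflow_cfm, air_temp_rise_c,
                       air_density=SEA_LEVEL_DENSITY):
    """Heat carried away by an air stream that warms by air_temp_rise_c
    while passing over the board (Q = m_dot * cp * dT)."""
    mass_flow_kg_per_s = air_density * airflow_cfm * CFM_TO_M3_PER_S
    return mass_flow_kg_per_s * AIR_CP_J_PER_KG_K * air_temp_rise_c


# Same volumetric airflow, two air densities (the lower one is an assumed
# value standing in for thinner air).
for density in (SEA_LEVEL_DENSITY, 0.6):
    watts = heat_removed_watts(airflow_cfm=10.0, air_temp_rise_c=10.0,
                               air_density=density)
    print(f"density {density:.1f} kg/m^3: {watts:.1f} W removed")
```

Because the result scales with air density, a fan that moves the same volume of air removes less heat when the air is thinner, which is why mass airflow rather than volumetric airflow is the quantity that matters.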
To make the analysis process easier, board vendors typically provide graphs showing the amount of airflow required to sufficiently cool the board at a given air temperature. In fact, the specifications for some form factors, such as MicroTCA, explicitly require that spec-compliant boards include the temperature versus airflow curves in their documentation. With the temperature and airflow information in hand, the designer can determine the thermal limits of the system, assuming the amount of airflow provided to each slot can be measured or calculated.
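As a rough illustration of how such a curve might be used, the sketch below linearly interpolates a vendor-style temperature versus airflow curve and flags slots whose measured airflow falls short. The curve points, slot numbers, and airflow readings are all hypothetical; a real analysis would use the values from the board documentation and from measurements of the actual chassis.

```python
from bisect import bisect_left

# Hypothetical curve: (ambient air temperature in °C, required airflow in CFM).
CURVE = [(25.0, 5.0), (40.0, 8.0), (55.0, 14.0), (70.0, 25.0)]


def required_airflow(ambient_c, curve=CURVE):
    """Linearly interpolate the airflow required at ambient_c.
    Refuses to extrapolate beyond the published curve."""
    temps = [t for t, _ in curve]
    if ambient_c < temps[0] or ambient_c > temps[-1]:
        raise ValueError("ambient temperature is outside the published curve")
    i = bisect_left(temps, ambient_c)
    if temps[i] == ambient_c:
        return curve[i][1]
    (t0, f0), (t1, f1) = curve[i - 1], curve[i]
    return f0 + (f1 - f0) * (ambient_c - t0) / (t1 - t0)


def slots_with_insufficient_airflow(slot_airflow_cfm, ambient_c, curve=CURVE):
    """Return the slots whose airflow is below the interpolated requirement."""
    needed = required_airflow(ambient_c, curve)
    return [slot for slot, flow in slot_airflow_cfm.items() if flow < needed]


# Hypothetical per-slot airflow readings in CFM.
measured = {1: 12.0, 2: 9.5, 3: 11.0}
print(required_airflow(47.5))
print(slots_with_insufficient_airflow(measured, ambient_c=47.5))
```

Refusing to extrapolate is a deliberate choice in this sketch: outside the published range, the curve says nothing about how the board behaves.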
However, in recent years, as smaller systems with more rugged specifications have become more common, fans have become an issue due to limited power budgets and MTBF requirements. Even with advances in fan design and construction, fans remain one of the most common points of mechanical failure in any system.
Because fans are mechanical devices, they are prone to mechanical wear and contribute to system vibration. Over time, fans can slowly degrade or fail completely, severely affecting the thermal health of a system. In addition to mechanical failures, fan use increases power consumption and adds noise to the system. Furthermore, there are systems in which it is not feasible to use fans for thermal management, such as industrial control and transportation systems that have sealed or space-constrained requirements. For these reasons, onboard and passive cooling fans are now mostly used for desktop computing and embedded systems in environmentally controlled areas.
Conduction Cooling: Despite the cooling advantages of fans, they are impractical in some applications. For example, when the system operates in an unpressurized airborne environment, the amount of air available may be insufficient for cooling. Similarly, the MTBF requirements of some applications, such as those in remote locations that are difficult to reach for service, may dictate a solution with no moving parts. In such cases, the preferred cooling mechanism is conduction cooling. In conduction cooling systems, heat is dissipated by transferring it from the heat-producing elements on the boards to the external wall(s) of the system. This requires a completely different cooling mechanism, in which a metal conduction plate is used in place of a traditional heatsink. The conduction plate, in turn, is securely connected to the chassis using wedge locks. Wedge locks expand when tightened, pressing the opposite side of the board tightly against the chassis. With the wedge locks engaged, the primary cooling path is from the heat-producing components to the conduction plate, then to the chassis.
Since heat transfer by conduction is more efficient than transfer by convection, higher power boards can be effectively cooled via conduction cooling. The wedge locks also securely fasten the board in the chassis, making it more resistant to shock and vibration.
However, the downside of conduction cooling is that a conduction-cooled board is much more expensive than an equivalent convection-cooled board. It also requires a custom (and probably expensive) chassis designed to accept conduction-cooled boards and their wedge locks.
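The conduction path described above (components, conduction plate, wedge locks, chassis) is often approximated as thermal resistances in series. The sketch below uses that lumped model with hypothetical resistance values, a hypothetical chassis wall temperature, and the assumption that all of the board power flows through every stage; it is an illustration of the idea, not a substitute for thermal modeling of a real board.

```python
# Hypothetical stage resistances in °C/W, listed from the component outward.
STAGES = [
    ("component to conduction plate", 0.3),
    ("conduction plate to wedge lock", 0.2),
    ("wedge lock to chassis wall", 0.4),
]


def path_temperatures(chassis_wall_c, power_w, stages=STAGES):
    """Return (stage name, hot-side temperature in °C) for each stage,
    in the same component-outward order as `stages`.
    All of the board power is assumed to flow through every stage."""
    results = []
    temperature = chassis_wall_c
    # Walk from the chassis wall back toward the component, adding the
    # temperature rise across each stage (dT = P * R).
    for name, resistance in reversed(stages):
        temperature += power_w * resistance
        results.append((name, temperature))
    results.reverse()
    return results


for name, temp in path_temperatures(chassis_wall_c=55.0, power_w=50.0):
    print(f"{name}: hot side at {temp:.1f} °C")
```

In this model every interface adds to the component temperature in proportion to the power, which is one way to see why firm wedge-lock contact with the chassis matters.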
Fanless Convection Cooling: While fans may be undesirable in some cases, the cost of a customized conduction-cooled solution may be prohibitive. If the computing performance requirements are somewhat modest, an alternative is fanless convection cooling. Although this approach is commonly called “fanless convection” or “natural convection,” the name is something of a misnomer. The heatsink in a fanless application allows convection cooling via the natural airflow (hot air rises), but it also provides for thermal dissipation via radiation.
Like conduction cooling, this cooling mechanism allows for a fanless system. However, because “natural convection” and radiation are far less efficient cooling mechanisms, the amount of power that can be dissipated is far less. As a rule of thumb, natural convection has a practical limit of about 15 watts of heat dissipation for a 6U form factor such as VME, or about 12 watts for a 3U form factor such as CompactPCI. By comparison, conduction-cooled boards can consume 70 watts or more.
As with all cooling methods, software applications are available to model a natural convection cooled system and help designers determine a more exact power dissipation threshold. The 12 to 15 watt rule of thumb is a typical range for applications with less than 60°C ambient air temperature and embedded CPU junction temperatures of approximately 105°C.
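The relationship between ambient temperature, junction temperature, and power can be sketched with a single lumped junction-to-ambient thermal resistance. This is a simplifying assumption, not the method behind the rule of thumb: real systems have several heat sources and paths, which is why modeling software is used. The function names are illustrative.

```python
def required_theta_ja(junction_limit_c, ambient_c, power_w):
    """Largest junction-to-ambient thermal resistance (°C/W) that keeps
    the junction at or below its limit for the given power."""
    if power_w <= 0:
        raise ValueError("power_w must be positive")
    return (junction_limit_c - ambient_c) / power_w


def max_dissipation_watts(junction_limit_c, ambient_c, theta_ja_c_per_w):
    """Largest power (W) that keeps the junction at or below its limit
    for a given junction-to-ambient thermal resistance."""
    if theta_ja_c_per_w <= 0:
        raise ValueError("theta_ja_c_per_w must be positive")
    return (junction_limit_c - ambient_c) / theta_ja_c_per_w


# Temperatures from the text: 105 °C junction, 60 °C ambient.
for power in (12.0, 15.0):
    theta = required_theta_ja(105.0, 60.0, power)
    print(f"{power:.0f} W needs about {theta:.2f} °C/W or better")
```

Under this lumped model, the 45°C margin between ambient and junction is the entire thermal budget, so any increase in ambient temperature reduces the allowable power proportionally.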
In the past, the 12 to 15 watt threshold largely restricted natural convection to microcontroller applications and some low-power PowerPC applications. However, with the advent of Intel’s Atom N270 CPU, the capability now exists for a full single board computer with acceptable performance for general purpose computing applications. This architecture allows all high-speed peripherals, including Gigabit Ethernet, SATA, and PCI Express, to run at full bandwidth.
Thermal management continues to be an important design consideration in virtually every embedded system application. The good news is that there are many viable cooling solutions, but choosing the right one depends upon numerous factors and requirements, including the type of system enclosure, noise specifications, system longevity, price/performance goals, and rugged or remote deployment. Optimal thermal design relies on good thermal modeling, verified by actual measurements in the application. Accurate temperature predictions enable designers to refine and select the right thermal management solution for the job.