As early as 25 years ago, industrial system integrators saw the potential that the Windows operating system brought to PCs. Windows offered advanced graphics capabilities, in contrast to the relatively primitive human interfaces of DOS-based applications and other proprietary OSes. This enabled the development of controllers with advanced human-machine interfaces (HMIs) that provide a whole new level of functionality and make machines easier to use and maintain.

The problem with Windows, however, is that it isn’t deterministic. Factory automation applications typically involve motion control systems, which rely on timely reading of hardware position sensors to provide feedback on the position of motion axes. The Windows OS is not designed to respond to external stimuli in a predictable amount of time and therefore cannot, by itself, control applications involving multiple rapid events that have to occur at specific times. So most early industrial PCs were limited to serving as operator interfaces, or were interfaced to a second computer that ran a real-time operating system (RTOS). In due time, multi-workload environments enabled Windows and an RTOS to run side by side on a single system.
Recently, industrial PCs have begun to be fitted with multicore processors whose processing capacity and enhanced computing features enable them to perform functions that were once exclusive to application-specific processors such as digital signal processors (DSPs). OEMs that want to decrease system costs are looking to leverage multicore processors to consolidate control functions that have been implemented on several separate pieces of computing hardware. But adding more real-time tasks, plus human interface tasks, and distributing them among multiple processor cores while preserving the real-time responsiveness of the overall system is challenging.
Fundamentals of Control Applications
To better understand the challenge of running several real-time functional blocks such as motion control operations on the same computing platform, it is useful to go over the fundamentals of what a real-time control system consists of from a software perspective. Machines are controlled by control loops. A control system sends out a command to a motion system on the machine and then periodically samples the result, making corrections until such time as execution of the command is complete. The faster and more complex the action is, the faster the periodic sampling and correction (called the control loop) must be.
From a software point of view, control loops as run by an RTOS consist of a high-priority sampling thread that is triggered by an event such as an internal clock, which interrupts background processing by the computer. This thread reads data from machine sensors and re-enables interrupts when complete. The RTOS then runs the next-highest-priority thread. Frequently, the task that is resumed is the thread where acquired data is processed and the result is used to make corrections to the system as required. Alternatively, the task that is resumed could be updating a Windows HMI application running alongside the RTOS control loop. If nothing needs to be done, the processor may simply stay idle until the control loop starts again.
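The sample-and-correct cycle described above can be sketched as a small simulation. This is only an illustration under stated assumptions: the proportional correction, the `gain` and `tolerance` values, and the function name `run_control_loop` are invented for the example and are not taken from any particular controller.

```python
def run_control_loop(target, position=0.0, gain=0.5, tolerance=0.01, max_cycles=100):
    """Simulate a sampled control loop: read, compare, correct, repeat.

    All parameters are illustrative assumptions, not values from a real machine.
    Returns the number of correction cycles used and the position history.
    """
    history = [position]
    for cycle in range(max_cycles):
        # Sampling step: read the (simulated) sensor and compare with the command.
        error = target - position
        if abs(error) <= tolerance:
            return cycle, history        # the command is considered complete
        # Processing step: apply a proportional correction to the axis.
        position += gain * error
        history.append(position)
    return max_cycles, history


cycles, history = run_control_loop(target=10.0)
print(f"settled after {cycles} cycles at position {history[-1]:.4f}")
```

In a real system each pass of the loop would be triggered by the clock event rather than by a `for` statement, so the time between passes is the loop time discussed in the text.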
From a graphical point of view, an application that performs periodic monitoring of an event can be depicted as a loop, as shown in Figure 1. Increasing the processing capacity without decreasing the loop time leaves more processor idle time, as shown in Figure 1 (A → A-1), while speeding up the loop decreases the overall time that the loop takes to complete and also reduces the idle time (A → A-2 and A-1 → A-3). Reducing loop times is sometimes necessary when more precise control is required.

It gets interesting when one tries to integrate two time-critical workloads, such as two independent control loops, on the same processor, as shown in Figure 2. For example, consider a machine that performs motion control and interfaces to remote motor drives via an Ethernet-based control bus such as EtherCAT or PROFINET. Both functions, the motion control and the Ethernet-based control bus, must be serviced at specific time intervals that are typically asynchronous to each other. In situations such as this, certain things need to be understood and accounted for, including:
- The data acquisition typically needs to happen at a prescribed time or at least within the loop time. This means that the sampling threads must have higher priority than any other threads. Also, the faster control loop will probably be given priority over the slower one. So, as shown in Figure 3, the priorities might be: sampling thread of Loop A, sampling thread of Loop B, processing thread of Loop A, processing thread of Loop B, and then any other threads. In the diagram, because Loop A has to acquire its data at a certain time, it has to interrupt Loop B during its data processing cycle to acquire its data and then return to Loop B until it has completed its data processing, before Loop A data processing can start.
- Every time a thread is interrupted, context switching takes place. Context switching is the saving of all information that is critical to the previous state of the machine, so that processing of the interrupted task can be resumed after the interrupt is handled. This burns precious time, especially when several fast control loops are running at the same time. It’s costly even in the above example, where only two control loops are running on the same processor.
- Since both loops are running asynchronously to each other, it is possible for their sampling threads to coincide. When this happens, because Loop A’s sampling thread is higher priority than Loop B’s, handling Loop A’s interrupt will take precedence. Good programming practice requires that interrupts be re-enabled as soon as the data has been read by the sampling thread. Any delay will affect the data acquisition performed by Loop B’s sampling thread, which is initiated by the clock event that causes the Loop B interrupt.
Increasing processing power will shorten the processing time, increase the idle time, and therefore add the capability to handle more tasks or run a faster loop. The loops, however, will still interrupt each other from time to time.
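To make the interaction between the two loops concrete, here is a minimal tick-based simulation of one processor core running both loops under fixed-priority preemptive scheduling, using the priority order listed above. The periods and thread durations are invented for illustration (one tick might stand for 10 µs), context-switch cost is ignored, and the `Loop` and `simulate` names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Loop:
    name: str
    period: int        # ticks between clock events
    sample: int        # ticks needed by the sampling thread
    process: int       # ticks needed by the processing thread
    sample_prio: int   # lower number = higher priority
    process_prio: int


def current_priority(job):
    """A cycle competes at its sampling priority until sampling is finished."""
    loop, _release, sample_left, _process_left = job
    return loop.sample_prio if sample_left else loop.process_prio


def simulate(loops, horizon):
    """Run all loops on one core; return each loop's per-cycle response times."""
    jobs = []                                  # [loop, release, sample_left, process_left]
    response = {lp.name: [] for lp in loops}
    for now in range(horizon):
        for lp in loops:
            if now % lp.period == 0:           # clock event starts a new cycle
                jobs.append([lp, now, lp.sample, lp.process])
        if not jobs:
            continue                           # idle tick
        job = min(jobs, key=current_priority)  # highest-priority ready thread runs
        if job[2]:
            job[2] -= 1                        # one tick of sampling
        else:
            job[3] -= 1                        # one tick of processing
        if job[2] == 0 and job[3] == 0:
            response[job[0].name].append(now + 1 - job[1])
            jobs.remove(job)
    return response


# Assumed example values: Loop A is the faster loop, Loop B is asynchronous to it.
loop_a = Loop("A", period=50, sample=2, process=15, sample_prio=0, process_prio=2)
loop_b = Loop("B", period=70, sample=2, process=20, sample_prio=1, process_prio=3)
response = simulate([loop_a, loop_b], horizon=350)
print("Loop A response times:", response["A"])
print("Loop B response times:", response["B"])
```

With these assumed numbers, Loop A alone would always finish 17 ticks after its clock event and Loop B 22 ticks after its own. Sharing the core makes both figures vary from cycle to cycle, and the lower-priority Loop B is affected far more than Loop A.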
Latency and Jitter

Latency (the time between initiating a request for data and receiving a response) is generally manageable when a fast enough processor is used and the data is available at a predictable instant. Problems arise in control systems when latency is inconsistent, which leads to jitter. Jitter shows up as erratic data readings caused by a system that doesn’t maintain a uniform sampling period. It can lead to system instabilities, which in turn can result in poor-quality products, maintenance problems, or even safety risks.
In a control loop, there are fundamentally two kinds of jitter that need to be taken into account: sampling jitter (data acquisition) and control jitter (control correction). In the example shown in Figure 2, where two control loops are run on the same processor, Loop A’s control correction is delayed significantly, almost doubling in time from tc1 to tc2. In addition, because the control loops are independent and asynchronous, the interruption can occur at different times from cycle to cycle and therefore manifests itself as jitter. This could have been averted if Loop A’s processing thread had been given higher priority than Loop B’s interrupt thread, but then the sampling time of Loop B would have been delayed significantly. The viability of that tradeoff can only be determined on a system-by-system basis. The bottom line is that attempting to run two or more independent control loops on the same processor may require some compromises, and in some cases the tasks may not work together, especially if the control loops are running at higher rates. For instance, if one were to take Loop A running at a 500 µs cycle time, and Loop B running at 1/6 that speed, it is easy to imagine that it would be difficult for both Loop A and Loop B to run on the same processor.
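One common way to put a number on sampling jitter is to compare the measured intervals between samples with the nominal loop period. The sketch below does this for a list of timestamps. The timestamps are made-up values for illustration, and `sampling_jitter` is a hypothetical helper, not part of any RTOS API.

```python
def sampling_jitter(timestamps_us, nominal_period_us):
    """Return (peak-to-peak jitter, worst-case deviation) in microseconds."""
    # Interval between each pair of consecutive samples.
    intervals = [later - earlier
                 for earlier, later in zip(timestamps_us, timestamps_us[1:])]
    # How far each interval strays from the intended loop period.
    deviations = [interval - nominal_period_us for interval in intervals]
    peak_to_peak = max(deviations) - min(deviations)
    worst_case = max(abs(d) for d in deviations)
    return peak_to_peak, worst_case


# Made-up sample times for a loop that is supposed to run every 500 µs.
timestamps = [0, 500, 1004, 1500, 1997, 2500]
peak_to_peak, worst_case = sampling_jitter(timestamps, nominal_period_us=500)
print(f"peak-to-peak jitter: {peak_to_peak} µs, worst deviation: {worst_case} µs")
```

The same calculation applied to the times at which corrections are issued, rather than the times at which samples are taken, gives a figure for control jitter.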
Multicore to the Rescue
Moving to multicore processors opens up options. However, these require some examination, because it is becoming clear that it is not simple for an operating system (OS) to allocate available processing capacity in a way that addresses the kind of challenges discussed above. Some operating systems allocate tasks among available processor cores in a manner that keeps the processors as busy as possible, without regard to the timing of individual tasks. This technique is called symmetric multi-processing (SMP). Distributing real-time, I/O-intensive control applications is still an elusive target for SMP schedulers. To start with, it is difficult enough for a scheduler to estimate how much processing it needs to allocate to support independent applications. It becomes even more difficult when applications start interacting with each other, as when several control loops run on the same processing resource.
From the point of view of a real-time application such as a motion control system, asymmetric multi-processing (AMP) scheduling is the best way of distributing applications across a multicore processor. In such a topology, applications or parts of applications are distributed in a fixed way across the cores, with each core running independently of the others. Because AMP techniques assign tasks to specific processors, a known, constant amount of processing power is always available to the designated applications. Some OSes can take that a step further by allocating specific I/O and the associated interrupts to specific processor cores. This enables appropriate applications and I/O to be combined in such a way that only the application that is meant to service the I/O is interrupted, maximizing the performance of a control loop and keeping applications from interfering with each other.
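The idea of a fixed assignment of workloads to cores can be sketched as a static table plus an optional pinning call. This is only a rough approximation: the workload names and the `PARTITION` table are hypothetical, and CPU affinity on a general-purpose OS (here Python's `os.sched_setaffinity`, available on Linux) is not the same as the core-level partitioning an AMP-capable RTOS provides.

```python
import os

# Hypothetical static partition: core index -> the one workload pinned to it.
PARTITION = {
    0: "windows_hmi",
    1: "motion_control_loop",
    2: "ethernet_fieldbus_loop",
}


def core_for(workload, partition=PARTITION):
    """Look up the core that a workload is statically assigned to."""
    for core, name in partition.items():
        if name == workload:
            return core
    raise KeyError(f"no core assigned to {workload!r}")


def pin_current_process(core):
    """Pin the calling process to one core where the OS supports it.

    Returns True if the affinity was changed, False otherwise.
    """
    if hasattr(os, "sched_setaffinity") and core in os.sched_getaffinity(0):
        os.sched_setaffinity(0, {core})
        return True
    return False


motion_core = core_for("motion_control_loop")
print(f"motion control loop is assigned to core {motion_core}")
# A process hosting that loop would call pin_current_process(motion_core) at startup.
```

Because the table is fixed, the processing capacity available to each workload does not depend on what the other workloads are doing, which is the property the text attributes to AMP.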
Getting Tasks to Talk
Having optimized performance and ensured the integrity of the various components by partitioning and isolating components of an application on multiple processor cores, the next step is to make the components work as a whole. One way of doing this is to include some form of inter-process communication (IPC) so that the task components can communicate at the process level. When implemented properly, the components should work together as if they were all running on a single processor. Additionally, the ideal IPC mechanism should be designed so that using it does not affect the way the code works. That makes it possible for one code base to support various processor system configurations (e.g., single, dual, or quad core, with and without hyperthreading), depending on the performance requirements of the particular application (see Figure 3).
Industrial PCs have come a long way since the early days of single-task applications, and economies of scale and a huge software development industry promise to keep the PC architecture in place as the most popular platform for industrial computing for the foreseeable future. Taking advantage of the power of the latest multicore processor technologies in applications involving high-speed and/or high-precision functions such as motion control requires highly specialized operating software to support motion control and Ethernet-based control stacks that are designed to facilitate reliable processing of time-critical tasks in a multi-processor environment.
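As a rough illustration of task code that does not depend on where its peer runs, the sketch below connects a control task and an HMI task through a message channel from Python's standard `multiprocessing` module. The task functions, the message layout, and the `None` end-of-stream marker are assumptions made for the example. A real-time IPC mechanism in an RTOS would look different.

```python
from multiprocessing import Pipe


def control_task(channel, samples):
    """Real-time side: publish each position sample, then an end marker."""
    for position in samples:
        channel.send({"axis": "X", "position": position})
    channel.send(None)                     # assumed end-of-stream marker


def hmi_task(channel):
    """Operator-interface side: collect updates until the end marker arrives."""
    received = []
    while True:
        message = channel.recv()
        if message is None:
            return received
        received.append(message["position"])


# Here both ends run one after the other in a single process, which is enough
# to exercise the channel with a handful of small messages.
control_end, hmi_end = Pipe()
control_task(control_end, [1.0, 2.5, 4.0])
positions = hmi_task(hmi_end)
print("HMI received positions:", positions)
```

Neither function knows whether its peer is in the same process or on another core. The same two functions could be started as separate processes, each handed one end of the pipe, without changing their bodies.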
This article was written by Chris Grujon of TenAsys Corp., Beaverton, OR.