There are many implementation scenarios that combine the hardware interface with the processing-intensive OSI TCP/IP software protocol stack. A logical first step is to use the CPU to drive the stack. Today, computers come in single- and multi-core architectures. In a single-core computer, the logic that executes the TCP/IP stack resides in that computer's executable memory. Simply residing there occupies some memory, and actually driving the stack consumes a lot of CPU resources. The faster the stack needs to execute, the less time is available to execute the application on that computer. When a multi-core CPU is used, one of the simplest things to do is to move the TCP/IP stack software to a core all by itself. The dedicated core then executes the communications protocol on its own, leaving the remaining cores free for any other work that needs to be done at that node.
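As a rough illustration of dedicating a core to communications work, the sketch below pins the calling process to a single core using Python's `os.sched_setaffinity`, which is available on Linux only. The helper name `pin_to_core` is hypothetical and is not taken from any product discussed here. Note that this only restricts a user-space communications task to one core; it does not relocate a protocol stack that lives inside the operating system kernel.

```python
import os


def pin_to_core(core_id):
    """Restrict the calling process to one CPU core (hypothetical helper, Linux only).

    Returns the set of cores the process may run on afterwards.
    """
    if not hasattr(os, "sched_setaffinity"):
        raise NotImplementedError("CPU affinity control is not available on this platform")

    # Cores the process is currently allowed to use (pid 0 means "this process").
    allowed = os.sched_getaffinity(0)
    if core_id not in allowed:
        raise ValueError(f"core {core_id} is not in the allowed set {sorted(allowed)}")

    # From here on, the scheduler will only run this process on core_id.
    os.sched_setaffinity(0, {core_id})
    return os.sched_getaffinity(0)
```

A communications process might call `pin_to_core(3)` at startup while the application processes are confined to the other cores, so the two workloads no longer compete for the same core.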
Increasing the performance of an existing computer system architecture is another popular option. Adding memory for the CPU is always a good start. Another popular option is to use the OSI enhanced performance architecture.
Using a silicon stack to offload the work is another way to increase performance. A silicon stack is essentially an auxiliary processor whose sole purpose is to process communications. Silicon stacks provide additional capabilities such as IPv4/IPv6, iWARP RDMA, iSCSI, FCoE, TCP DDP, and full TCP offload. The elegance of the silicon stack is that the entire OSI TCP/IP stack, and more, can be implemented without impacting application logic performance.
Silicon Stack Performance
The software stack is limited to roughly 40 MBps, whereas the silicon stack can sustain 250 MBps on 1 GbE and 2500 MBps on 10 GbE. The host CPU overhead for the silicon stack implementation is extremely low, because the silicon stack is in essence a parallel engine to the CPU. The host CPU overhead for the software stack, on the other hand, is comparatively high, because the software stack competes for more and more CPU resources as the communication speed increases. Latency is the time it takes for a transmission to start after all of its parameters have been preconfigured. Here too, the silicon stack outperforms the software stack, because it uses only minimal CPU resources. Determinism is the variation in latency when sending and receiving packets. Again, the silicon stack wins because of its limited impact on CPU resources. As for reliability under load, the silicon stack experiences no noticeable change in performance, while the software stack is affected because it shares resources with any executing applications.
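The throughput, latency, and determinism terms above can be made concrete with a little arithmetic. The sketch below computes the raw Ethernet line rate in megabytes per second and summarizes a set of latency measurements. The function names are hypothetical, and taking the standard deviation as the measure of determinism is an assumption made for this example. The 250 MBps and 2500 MBps figures appear consistent with counting both directions of a full-duplex link, since 1 Gb/s is 125 MB/s in each direction before protocol overhead.

```python
import statistics


def line_rate_mbytes_per_s(link_gbps, full_duplex=False):
    """Raw line rate in MB/s (1 MB = 10**6 bytes), ignoring all protocol overhead."""
    per_direction = link_gbps * 1000 / 8  # gigabits/s -> megabytes/s
    return per_direction * 2 if full_duplex else per_direction


def latency_stats(samples_us):
    """Summarize latency samples given in microseconds.

    Returns (mean latency, jitter as population standard deviation,
    worst-case spread between slowest and fastest sample).
    """
    mean = statistics.mean(samples_us)
    jitter = statistics.pstdev(samples_us)
    spread = max(samples_us) - min(samples_us)
    return mean, jitter, spread


# Line rates for the two link speeds discussed in the text.
one_gbe = line_rate_mbytes_per_s(1, full_duplex=True)
ten_gbe = line_rate_mbytes_per_s(10, full_duplex=True)

# Made-up latency samples, for illustration only (not measured data).
example_samples_us = [10, 12, 11, 13, 14]
mean_us, jitter_us, spread_us = latency_stats(example_samples_us)
```

A lower jitter or spread for the same mean latency indicates a more deterministic link, which is the property the silicon stack is credited with here.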
TCP Offload Engine
How is a TCP offload engine (TOE) integrated into an embedded system? Many single-board computers (SBCs) today have XMC sites that accept an XMC form-factor TCP offload engine, which can potentially support up to four 10Gb Ethernet ports. The SBC will most likely have one or more 1Gb Ethernet ports as well, and those will be driven by the software stack executing on the SBC itself. In the context of the overall system, the TCP offload engine can provide up to four extremely high-speed ports operating in parallel with the 1Gb Ethernet port(s) on the SBC (see Figure 2).
Looking at this in block diagram form (see Figure 3), one can see the SBC and the bus on the SBC. The SBC can process packet information to perform a wide variety of functions. The TOE is connected back to the SBC through an XMC connector. The TOE card can be used as a switch, routing information from one network to another. Information could be routed from one 10Gb port through the SBC for some packet processing and modification, and then shipped off across the bus. Information could also come in for processing and then go out on the 1Gb Ethernet port. The capability gained by adding a TOE is access to 10Gb Ethernet ports on an SBC without impacting the processing power of the SBC.
Special FPGA Packet Processing
A process may require special packetization of information or manipulation of special packets. With an FPGA in one of the XMC sites, this work can be offloaded so that not all of the processing is done on the SBC.
For instance, suppose images are being captured and arrive over Gigabit Ethernet. The user may need to overlay two of these images. This overlay could be done right within the FPGA, with the result sent back to the CPU for any additional processing and then across the bus to some other location.
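To show what such an overlay involves computationally, here is a minimal software sketch of a per-pixel blend of two grayscale images. It assumes the images are plain lists of rows of 0 to 255 integer values, and that "overlay" means a weighted blend; the function name `overlay` and the `alpha` parameter are hypothetical. An FPGA would implement equivalent arithmetic in hardware logic rather than run code like this.

```python
def overlay(base, top, alpha=0.5):
    """Blend two equally sized grayscale images (lists of rows of 0-255 ints).

    alpha is the weight of the top image: 0.0 returns the base image,
    1.0 returns the top image.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0.0 and 1.0")
    if len(base) != len(top) or any(len(b) != len(t) for b, t in zip(base, top)):
        raise ValueError("images must have identical dimensions")

    # Every output pixel depends only on the two matching input pixels,
    # which is why this kind of work maps well onto parallel hardware.
    return [
        [round(alpha * t + (1 - alpha) * b) for b, t in zip(row_b, row_t)]
        for row_b, row_t in zip(base, top)
    ]
```

Because each output pixel is independent of its neighbors, the blend can be applied to pixels as they stream in from the network, without waiting for the CPU.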
The biggest trend today is sensors. There are more options and better-quality sensors for use in almost every embedded application. They are being designed to provide better situational awareness. To obtain a better understanding of what is going on in the world, it is necessary to actually gather and analyze more data. More data will spill over into increased storage requirements, and all of the systems architecture issues that go along with it, whether you are processing that information or moving it from place to place. No one wants to hear excuses that the data cannot be processed. Expectations have been amplified in terms of better results, sophisticated algorithms, and faster CPUs.
The data transport infrastructure is critically important in any system. It needs to assist with collecting data, moving that data to analytical engines in a timely fashion, and getting the data to storage devices. This is often where Gigabit and 10Gb Ethernet networking solutions make that possible. Even so, networking expectations are very high, both for time limits on the delivery of information and for how deterministic that delivery is. Technologies continue to move forward, and engineers concentrate on using the latest ones. Communications technology needs to meet the requirements of today, as well as make it possible to keep ahead of new specification requirements as the future may dictate.
This article was contributed by Acromag, Wixom, MI.