According to research performed at Sandia National Laboratories, the current trend of increasing the speed of supercomputers by adding more processor cores to individual chips may actually worsen performance for many complex applications. A Sandia team simulated key algorithms for deriving knowledge from large data sets. The simulations show a significant increase in speed going from two to four multicores, but an insignificant increase from four to eight. Exceeding eight multicores causes a decrease in speed: sixteen multicores perform barely as well as two, and beyond that, performance declines steeply as more cores are added.
The problem is a lack of memory bandwidth, combined with contention among processors over the memory bus available to each of them. The memory bus is the set of wires that carries memory addresses and data to and from the system RAM. To use a supermarket analogy, if two clerks at the same checkout counter are processing your groceries instead of one, the checkout should go faster. Theoretically, then, being served by four, eight, or sixteen clerks should improve performance further. The problem is that a clerk who can't reach the groceries doesn't actually help, and worse, the clerks may get in each other's way. The same concept apparently applies to processor cores.
According to a simulation of high-performance computers by Sandia's Richard Murphy, Arun Rodrigues, and former student Megan Vance, the lack of immediate access to individualized memory caches (the "food" of each processor) slows the process down instead of speeding it up once the number of cores exceeds eight. "To some extent, it is pointing out the obvious," admitted Rodrigues. "Many of our applications have been memory-bandwidth-limited even on a single core. However, it is not an issue to which industry has a known solution, and the problem is often ignored."
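The saturation effect described above can be observed, in miniature, on an ordinary multicore machine. The following is a minimal, illustrative sketch, not the Sandia simulation: it spawns worker processes that each sequentially scan a large in-memory list (a memory-bound rather than compute-bound task) and reports aggregate throughput as the worker count grows. The function and parameter names are my own, and exact numbers will vary by machine; the point is that aggregate throughput should flatten once the shared memory bus saturates, rather than scaling linearly with cores.

```python
# Illustrative sketch of memory-bandwidth contention among cores.
# (Hypothetical example; names like `stream` and N are assumptions,
# not part of the Sandia study.)
import multiprocessing as mp
import time

N = 500_000  # elements per worker; large enough to spill out of CPU caches

def stream(_):
    """Sequentially scan a large list and return elements scanned per second.

    The work is a simple running sum, so the loop is limited mainly by
    how fast memory can be read, not by arithmetic.
    """
    data = list(range(N))
    t0 = time.perf_counter()
    total = 0
    for x in data:
        total += x
    dt = time.perf_counter() - t0
    return N / dt  # per-worker scan rate

if __name__ == "__main__":
    # Run the same memory-bound scan with 1, 2, and 4 concurrent workers
    # and report the combined rate. On a bandwidth-limited machine the
    # aggregate stops growing well before the core count does.
    for workers in (1, 2, 4):
        with mp.Pool(workers) as pool:
            rates = pool.map(stream, range(workers))
        aggregate = sum(rates)
        print(f"{workers} workers: {aggregate / 1e6:.1f} M elements/s aggregate")
```

Each checkout clerk in the analogy corresponds to one worker process here; the shared memory bus is the single aisle they all draw groceries from.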

