Faster Science, Smarter Systems: mpibind Puts HPC Research in the Fast Lane

mpibind is a powerful, open-source tool that boosts productivity for HPC scientists by automatically and intelligently mapping applications to modern, heterogeneous supercomputers. With zero code changes or user intervention, it ensures optimal performance from day one—on any system. Portable across architectures, resource managers, and MPI libraries, mpibind simplifies complex setups and eliminates the need for time-consuming tuning. It seamlessly integrates with existing workflows, letting scientists focus on groundbreaking research—from modeling natural disasters to advancing healthcare—instead of battling system configs.

Transcript

00:00:11 Supercomputers play a vital role in scientific research and discovery, helping us to design new materials, simulate natural disasters, and predict disease outcomes. To meet the increasing demands of science, supercomputers are growing both in scale and complexity. Today's most advanced systems include multiple types of processors, memory hierarchies,

00:00:35 and memory domains. To make efficient and effective use of these machines, scientists face significant challenges in readying their applications to run on any given system. A single application can consist of millions of lines of code and hundreds of thousands of processes. For optimal performance, each part of the application needs to be mapped to specific hardware resources, then mapped

00:01:02 again if it runs on a different supercomputer. This costly, time-consuming task is daunting without intervention. The solution is mpibind, developed at Lawrence Livermore National Laboratory. mpibind takes care of mapping, optimizing an application's performance regardless of the machine it's running on. mpibind makes using modern high-performance computing systems easy,

00:01:29 so HPC application scientists can focus on their science. If a supercomputer is like a mountain range, all of its peaks, crevices, and rock formations represent hardware components like memory domains, cores, CPUs, and GPUs. mpibind automatically maps out the supercomputer's topology, showing each process in the application where to go

00:01:59 for the best performance, even for the most complex multiphysics applications and the most advanced computing architectures. The performance of scientific applications will suffer if they don't use compute resources in a smart way. First, mpibind uses a memory tree, which is built from the memory hierarchy, while compute resources are attached to their associated memory or

00:02:30 cache. Second, mpibind distributes application processes over the memory hierarchy using the memory tree in a top-down manner. This approach maximizes the amount of memory and cache available to each process. Because mpibind knows the hardware topology, it takes advantage of the fastest compute resources with the highest proximity to the memory system.
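
The narration above describes the mapping idea only at a high level. As a rough illustration (a toy sketch, not mpibind's actual implementation), the Python snippet below distributes tasks top-down over an assumed two-level memory tree of NUMA domains and their local cores: tasks are spread across memory domains first, then each task is given a private slice of the cores closest to its domain.

    # Toy sketch only, not mpibind's code: distribute n tasks over a node's
    # memory hierarchy top-down, so each task keeps the cores closest to its
    # share of memory. The two-level "NUMA domain -> local cores" tree below
    # is an assumed example topology.
    memory_tree = {
        "numa0": list(range(0, 8)),   # 8 cores local to NUMA domain 0
        "numa1": list(range(8, 16)),  # 8 cores local to NUMA domain 1
    }

    def distribute(n_tasks, tree):
        """Map each task to a memory domain and a private slice of its cores."""
        domains = list(tree.items())
        mapping = {}
        for task in range(n_tasks):
            # Step 1 (top of the tree): spread tasks evenly over memory
            # domains first, maximizing the memory and cache per task.
            d = task * len(domains) // n_tasks
            name, cores = domains[d]
            # Step 2 (down the tree): split the domain's local cores among
            # the tasks mapped to it, keeping compute close to its memory.
            local = [t for t in range(n_tasks)
                     if t * len(domains) // n_tasks == d]
            share = max(1, len(cores) // len(local))
            i = local.index(task)
            mapping[task] = (name, cores[i * share:(i + 1) * share])
        return mapping

    for task, (domain, cores) in distribute(4, memory_tree).items():
        print(f"task {task} -> {domain}, cores {cores}")

With four tasks, the sketch places two on each domain and gives every task four dedicated cores, which mirrors the behavior described in the narration: each process gets as much memory and cache to itself as the hierarchy allows.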

00:03:02 Scientists do not need to learn any of these hardware details or rewrite their code. They simply add this phrase to their launch command to enable mpibind. In cases where mpibind is already enabled by default in the supercomputer's resource manager, the user doesn't need to invoke mpibind at all. It will work automatically. HPC technology is advancing rapidly and

00:03:36 scientists need to use new supercomputers with unfamiliar hardware configurations. Think of the industry shift from CPU-based supercomputers to heterogeneous supercomputers with both CPUs and GPUs. Mapping applications to a new architecture efficiently is difficult and can take days, weeks, or even months. mpibind takes that burden off the users by

00:04:02 mapping applications automatically and without user intervention to any supercomputing system so that applications can run on day one. And because the mapping is based on the machine topology, it is optimized for that system. This increases the productivity of application scientists over a wide range of

00:04:29 supercomputers. mpibind has a large user base in the supercomputing community, both in industry and at national laboratories. Scientific application teams use mpibind for a range of challenging problems on many different computing architectures, including the most advanced exascale systems. HPE and Lawrence Livermore have been working together on the

00:04:55 deployment of the largest known system in the world right now, El Capitan. This is a very large system, with over 40,000 processing components, where scalability is really one of our main focuses. Any delay in any one of the tens of thousands of processes participating in an application can cause the entire application to slow down. The main issue we run into is that

00:05:25 of operating system noise. If any OS noise gets in the way of any one of the potentially 40,000 different parallel activities going on at the same time, then the entire synchronization is delayed. And so by doing the OS noise mitigation work, we isolate those background activities to cores and hardware threads separate from those that we'll use for the application. And

00:05:53 this is where mpibind comes into play. mpibind provides an easy way for users to place their jobs on the nodes of the system, automatically avoiding the resources where these overhead activities have been placed.
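
The phrase that enables mpibind on the launch command depends on the resource manager plugin in use, as noted at 03:02. Whichever way it is enabled, the resulting placement can be sanity-checked from inside a job. The snippet below is a generic Linux check, not part of mpibind; it assumes a Slurm launch, where each task receives a SLURM_PROCID environment variable.

    # Generic placement check (not part of mpibind): print the CPUs each task
    # is allowed to run on, e.g. to confirm that application tasks stay off
    # the cores reserved for OS/background activity.
    import os

    # SLURM_PROCID is set by Slurm for each task; fall back to "?" elsewhere.
    rank = os.environ.get("SLURM_PROCID", "?")

    # sched_getaffinity reports this process's allowed CPU set (Linux only).
    cpus = sorted(os.sched_getaffinity(0))
    print(f"task {rank}: allowed CPUs {cpus}")

Launched once per task (for example, srun -n 4 python check_cpus.py, where check_cpus.py is a hypothetical file name for the snippet above), each task prints its own CPU list; when the OS-noise isolation described above is in effect, the cores set aside for background work should be absent from every list.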