LEGION is a lightweight C-language software library that enables distributed, asynchronous data processing across a loosely coupled set of compute nodes. Loosely coupled means that a node can offer itself in service to a larger task at any time and can withdraw from service at any time, provided it is not actively engaged in an assignment. The main program, i.e., the one attempting to solve the larger task, does not need to know up front which nodes will be available, how many there will be, or when they will be available. This situation is typical of a “volunteer computing” framework. LEGION accomplishes its goals by providing message-based inter-process communication similar to the Message Passing Interface (MPI), but without MPI's tight coupling requirements. The software is lightweight and easy to install because it is written in standard C with no exotic library dependencies.
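To make the loose-coupling rule concrete, the following is a minimal sketch of the bookkeeping a main program might keep for volunteer nodes. It is not LEGION's API: the type `node_pool` and the functions `pool_join`, `pool_withdraw`, `pool_assign`, and `pool_complete` are hypothetical names chosen for illustration, and the fixed capacity is an assumption of the sketch.

```c
#include <assert.h>

#define MAX_NODES 16  /* illustrative capacity, not a LEGION limit */

typedef enum { SLOT_EMPTY, NODE_IDLE, NODE_BUSY } node_state;

typedef struct {
    node_state state[MAX_NODES];
    int job[MAX_NODES];  /* job id held by a busy node, -1 otherwise */
} node_pool;

void pool_init(node_pool *p) {
    assert(p != 0);
    for (int i = 0; i < MAX_NODES; i++) {
        p->state[i] = SLOT_EMPTY;
        p->job[i] = -1;
    }
}

/* A node may volunteer at any time. Returns its id, or -1 if the pool is full. */
int pool_join(node_pool *p) {
    for (int i = 0; i < MAX_NODES; i++) {
        if (p->state[i] == SLOT_EMPTY) {
            p->state[i] = NODE_IDLE;
            return i;
        }
    }
    return -1;
}

/* A node may withdraw only while idle. Returns 0 on success, -1 if refused. */
int pool_withdraw(node_pool *p, int id) {
    if (id < 0 || id >= MAX_NODES || p->state[id] != NODE_IDLE) return -1;
    p->state[id] = SLOT_EMPTY;
    return 0;
}

/* Hand a job to any idle node. Returns the node id, or -1 if no node is
   currently available, in which case the caller simply retries later. */
int pool_assign(node_pool *p, int job_id) {
    for (int i = 0; i < MAX_NODES; i++) {
        if (p->state[i] == NODE_IDLE) {
            p->state[i] = NODE_BUSY;
            p->job[i] = job_id;
            return i;
        }
    }
    return -1;
}

/* A busy node reports completion and becomes idle again.
   Returns the finished job id, or -1 if the node was not busy. */
int pool_complete(node_pool *p, int id) {
    if (id < 0 || id >= MAX_NODES || p->state[id] != NODE_BUSY) return -1;
    int done = p->job[id];
    p->job[id] = -1;
    p->state[id] = NODE_IDLE;
    return done;
}
```

The main program never needs a node count up front: `pool_assign` returning -1 just means no volunteer is idle right now, and a withdrawal request from a busy node is refused until that node reports completion.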
LEGION has been demonstrated in a challenging planetary science application in which a machine learning system is used in closed-loop fashion to efficiently explore the input parameter space of a complex numerical simulation. The machine learning system decides which jobs to run through the simulator; then, through LEGION calls, it farms those jobs out to a collection of compute nodes, retrieves the job results as they become available, and updates a predictive model of how the simulator maps inputs to outputs. Given the results so far, the system then decides which new set of jobs would be most informative to run. This basic loop repeats until sufficient insight into the physical system modeled by the simulator is obtained.
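The select, dispatch, update loop described above can be sketched with a toy problem. Everything here is an assumption made for illustration: `simulate` is a cheap stand-in for the expensive simulator, the "model" is just a bracket around the input value where the simulator's binary outcome flips, and the dispatch step is a plain loop where a real system would farm jobs out asynchronously. None of these names come from LEGION.

```c
#include <assert.h>

#define BATCH 4  /* jobs selected per round (hypothetical batch size) */

/* Stand-in for the expensive simulator: outcome flips at x = 0.37. */
int simulate(double x) {
    return x >= 0.37;
}

/* Narrow the bracket [lo, hi] around the flip point until it is at most
   tol wide. Assumes simulate(lo) == 0, simulate(hi) == 1, and a single
   flip in between. Returns the midpoint estimate and writes the number
   of rounds used to *rounds. */
double explore(double lo, double hi, double tol, int *rounds) {
    assert(lo < hi && tol > 0.0 && rounds != 0);
    *rounds = 0;
    while (hi - lo > tol) {
        double jobs[BATCH];
        int result[BATCH];
        double step = (hi - lo) / (BATCH + 1);

        /* 1. Select: place the new jobs where the model is uncertain,
              i.e. evenly inside the current bracket. */
        for (int i = 0; i < BATCH; i++) jobs[i] = lo + step * (i + 1);

        /* 2. Dispatch: run the jobs. A distributed version would send
              these to compute nodes and collect results as they arrive. */
        for (int i = 0; i < BATCH; i++) result[i] = simulate(jobs[i]);

        /* 3. Update: tighten the bracket using the new labeled results. */
        double new_lo = lo, new_hi = hi;
        for (int i = 0; i < BATCH; i++) {
            if (result[i] == 0 && jobs[i] > new_lo) new_lo = jobs[i];
            if (result[i] == 1 && jobs[i] < new_hi) new_hi = jobs[i];
        }
        lo = new_lo;
        hi = new_hi;
        (*rounds)++;
    }
    return 0.5 * (lo + hi);
}
```

Each round shrinks the bracket by a factor of `BATCH + 1`, so a handful of rounds replaces the hundreds of trials a uniform sweep at the same resolution would need. That saving is the point of directed exploration when each trial is costly.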
This work was done by Michael C. Burl of Caltech for NASA’s Jet Propulsion Laboratory. This software is available for commercial licensing. Please contact Daniel Broderick of the California Institute of Technology at
This Brief includes a Technical Support Package (TSP).

LEGION: Lightweight Expandable Group of Independently Operating Nodes (reference NPO-47910) is currently available for download from the TSP library.
Overview
The document is a Technical Support Package from NASA's Jet Propulsion Laboratory (JPL) detailing the LEGION system, whose name stands for Lightweight Expandable Group of Independently Operating Nodes. LEGION is designed to facilitate distributed computing, allowing multiple nodes to work together on complex tasks, particularly scientific simulations and engineering applications.
The primary focus of the document is the application of LEGION to physics-based simulation codes, which are essential for modeling complex systems that are otherwise difficult to study. These simulations, such as asteroid collision models, provide high-fidelity representations of system behavior but are often time-consuming, taking up to a full day for a single trial on a single CPU. To keep the total number of trials manageable, the document discusses a directed exploration strategy that selects new simulation trials based on previous results, so that the input parameter space is explored with fewer runs.
The document highlights the closed-loop system where simulation trials are run, results are aggregated, and active learning is employed to determine the most valuable new trials to conduct. This iterative process builds a body of labeled training data, which is then used to create a simplified predictive model of the system's behavior.
While LEGION offers significant advantages, the document also outlines current limitations. For instance, LEGION assumes that all nodes have read/write access to a shared file system, which can restrict the inclusion of remote nodes. The document also raises concerns about the concurrency safeguards needed to prevent race conditions and about the efficiency of message passing, particularly in scenarios where rapid responses are required.
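One widely used safeguard against such race conditions, when messages are exchanged as files on a shared file system, is to write each message to a temporary name and then rename it into place. The sketch below illustrates that pattern only. Whether LEGION works this way is not stated here, and `post_message` and `take_message` are hypothetical names. It also assumes a single reader per message file and a platform where `rename` within one file system is atomic, as POSIX specifies; network file systems may give weaker guarantees.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Publish a message so that a reader never sees a partially written file:
   write to "<path>.tmp" first, then rename it to <path>.
   Returns 0 on success, -1 on failure. */
int post_message(const char *path, const char *payload) {
    char tmp[512];
    int n = snprintf(tmp, sizeof tmp, "%s.tmp", path);
    if (n < 0 || (size_t)n >= sizeof tmp) return -1;  /* path too long */

    FILE *f = fopen(tmp, "w");
    if (!f) return -1;
    int ok = (fputs(payload, f) >= 0);
    if (fclose(f) != 0) ok = 0;  /* flush errors surface at close */

    if (!ok || rename(tmp, path) != 0) {
        remove(tmp);  /* do not leave a stale temporary file behind */
        return -1;
    }
    return 0;
}

/* Read and consume a message. Returns the number of bytes read, or -1 if
   no message is present. Assumes only one reader polls this path. */
int take_message(const char *path, char *buf, size_t cap) {
    if (cap == 0) return -1;
    FILE *f = fopen(path, "r");
    if (!f) return -1;  /* nothing posted yet */
    size_t n = fread(buf, 1, cap - 1, f);
    buf[n] = '\0';
    fclose(f);
    remove(path);  /* mark the message as consumed */
    return (int)n;
}
```

Because the reader has to poll for the file to appear, latency is bounded below by the polling interval and by file-system caching. This is one plausible reason file-based message passing can be a poor fit when rapid responses are required.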
The document also draws comparisons to other distributed computing frameworks, such as SETI@home and BOINC, noting that while LEGION provides a useful communication mechanism, further development is needed to enhance its capabilities for remote node participation.
In summary, the document presents LEGION as a promising tool for enhancing distributed computing in scientific simulations, while also acknowledging the challenges that need to be addressed to fully realize its potential in various applications.

