The Pathcalc computer program estimates the time needed to execute a given application program on a parallel computer of given computation and network capabilities. Pathcalc can be used to analyze the effects of changes in such parameters as central-processing-unit (CPU) speed, network bandwidth, and network latency. Pathcalc is written in Java and should therefore run on most computers for which a Java runtime is available.
Pathcalc could be used to determine how long it would take to execute the same application program on a different parallel computer, or whether a specified faster network or faster CPU would bring the execution time down to an acceptable level. It could also be used to determine which part of a parallel system slows execution of a given application program the most. For example, by artificially setting the CPU speed very high, one could determine how much time is spent in communication; by artificially setting the communication speed very high, one could determine how much time is consumed in CPU operations.
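The bottleneck test described above can be illustrated with a deliberately simple cost model. The sketch below is not Pathcalc's algorithm: it assumes a single additive model in which execution time is compute time plus a per-message latency cost plus a per-byte transfer cost, and the class name `WhatIfSketch`, the method `estimateSeconds`, and all numeric values are hypothetical.

```java
// Hypothetical additive cost model, used only to illustrate the
// "set one parameter very high" technique; not Pathcalc's actual code.
final class WhatIfSketch {

    // cpuSeconds: compute time measured on the original machine.
    // cpuSpeedup: speed of the hypothetical CPU relative to the original.
    // messages, bytes: communication volume taken from the traces.
    static double estimateSeconds(double cpuSeconds, double cpuSpeedup,
                                  long messages, double bytes,
                                  double latencySeconds, double bytesPerSecond) {
        double compute = cpuSeconds / cpuSpeedup;
        double communication = messages * latencySeconds + bytes / bytesPerSecond;
        return compute + communication;
    }

    public static void main(String[] args) {
        // Made-up workload: 10 s of computation, 1000 messages, 100 MB sent.
        double baseline = estimateSeconds(10.0, 1.0, 1000, 1e8, 1e-4, 1e8);
        // A near-infinitely fast CPU leaves only the communication time.
        double commOnly = estimateSeconds(10.0, 1e12, 1000, 1e8, 1e-4, 1e8);
        // A near-infinitely fast network leaves only the CPU time.
        double cpuOnly = estimateSeconds(10.0, 1.0, 1000, 1e8, 0.0, 1e18);
        System.out.println("baseline estimate (s):  " + baseline);
        System.out.println("communication only (s): " + commOnly);
        System.out.println("computation only (s):   " + cpuOnly);
    }
}
```

Comparing the two limiting estimates with the baseline shows which resource dominates. In a real parallel run, computation and communication on different CPUs overlap, so the two limiting values need not sum to the baseline as they do in this additive model.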
It is not necessary to understand the application program or to mathematically model the network in order to use Pathcalc. All that is needed is a set of trace files (one for each CPU of the computer) from a previous run of the application program. Pathcalc then generates its estimate on the basis of the trace files and the network parameters provided by the user.
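One way to picture an estimate built from per-CPU traces and user-supplied network parameters is a replay that keeps a virtual clock for each CPU. The following is a minimal sketch under stated assumptions, not Pathcalc's implementation or trace format: the `Event` record layout, the nonblocking-send model, the first-in-first-out matching of messages between each pair of CPUs, and every name in the code are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of trace replay; not Pathcalc's code or trace format.
final class TraceReplaySketch {
    enum Kind { COMPUTE, SEND, RECV, BARRIER }

    // One trace record: 'peer' is the other CPU (or -1); 'amount' is
    // seconds for COMPUTE, bytes for SEND, and unused otherwise.
    static final class Event {
        final Kind kind; final int peer; final double amount;
        Event(Kind kind, int peer, double amount) {
            this.kind = kind; this.peer = peer; this.amount = amount;
        }
    }

    // Returns the estimated finish time of the slowest CPU, in seconds.
    static double replay(List<List<Event>> traces, double cpuSpeedup,
                         double latencySec, double bytesPerSec) {
        int n = traces.size();
        double[] clock = new double[n];        // virtual time of each CPU
        int[] next = new int[n];               // index of next event per CPU
        boolean[] atBarrier = new boolean[n];
        // Arrival times of sent messages, keyed by (source * n + destination).
        Map<Integer, Deque<Double>> inFlight = new HashMap<>();
        boolean progress = true;
        while (progress) {
            progress = false;
            for (int cpu = 0; cpu < n; cpu++) {
                List<Event> trace = traces.get(cpu);
                while (!atBarrier[cpu] && next[cpu] < trace.size()) {
                    Event e = trace.get(next[cpu]);
                    if (e.kind == Kind.COMPUTE) {
                        clock[cpu] += e.amount / cpuSpeedup;
                    } else if (e.kind == Kind.SEND) {
                        // Assumption: the sender does not wait for delivery.
                        double arrival = clock[cpu] + latencySec + e.amount / bytesPerSec;
                        inFlight.computeIfAbsent(cpu * n + e.peer, k -> new ArrayDeque<>()).add(arrival);
                    } else if (e.kind == Kind.RECV) {
                        Deque<Double> queue = inFlight.get(e.peer * n + cpu);
                        if (queue == null || queue.isEmpty()) break; // sender not replayed yet
                        clock[cpu] = Math.max(clock[cpu], queue.poll());
                    } else {
                        atBarrier[cpu] = true; // wait for every other CPU
                    }
                    next[cpu]++;
                    progress = true;
                }
            }
            boolean allWaiting = n > 0;
            for (boolean waiting : atBarrier) allWaiting &= waiting;
            if (allWaiting) { // release the barrier at the latest arrival time
                double release = Arrays.stream(clock).max().getAsDouble();
                Arrays.fill(clock, release);
                Arrays.fill(atBarrier, false);
                progress = true;
            }
        }
        for (int cpu = 0; cpu < n; cpu++) {
            if (atBarrier[cpu] || next[cpu] < traces.get(cpu).size()) {
                throw new IllegalStateException("trace replay stalled at CPU " + cpu);
            }
        }
        return Arrays.stream(clock).max().orElse(0.0);
    }

    // Made-up two-CPU trace: CPU 0 computes and sends; CPU 1 computes,
    // receives, and computes again; both then meet at a barrier.
    static List<List<Event>> exampleTraces() {
        return Arrays.asList(
            Arrays.asList(new Event(Kind.COMPUTE, -1, 2.0), new Event(Kind.SEND, 1, 1e6),
                          new Event(Kind.BARRIER, -1, 0)),
            Arrays.asList(new Event(Kind.COMPUTE, -1, 1.0), new Event(Kind.RECV, 0, 0),
                          new Event(Kind.COMPUTE, -1, 0.5), new Event(Kind.BARRIER, -1, 0)));
    }

    public static void main(String[] args) {
        // Network parameters (1 ms latency, 1 MB/s) are illustrative only.
        System.out.println(replay(exampleTraces(), 1.0, 0.001, 1e6));
    }
}
```

Because the replay follows the recorded sequence of events exactly, a sketch like this only makes sense when the application takes the same path through its code regardless of message timing, and it handles only send, receive, and barrier events.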
The estimate is valid only (1) for a computer with the same number of nodes as the one used to generate the trace files; (2) provided that message passing is restricted to such simple routines as send, receive, and barrier calls; and (3) provided that the application program can be relied upon always to follow the same path through the code, regardless of changes in network response times. In situations in which these restrictions are acceptable, Pathcalc offers advantages of simplicity and speed over a number of other programs that estimate execution times of application programs: unlike those estimators, Pathcalc works from the trace information alone rather than executing the application programs themselves.
This work was done by Paul Springer of Caltech for NASA's Jet Propulsion Laboratory. NPO-20237