The Government Resource for Algorithm Verification, Independent Test, and Evaluation (GRAVITE) system is a National Oceanic and Atmospheric Administration (NOAA) system developed and deployed by the Joint Polar Satellite System (JPSS) Ground Project to support Calibration and Validation (Cal/Val), Data Quality Monitoring, and Algorithm Investigation, Tuning, and Integration. GRAVITE enables novice and expert users to discover and obtain data easily by using standard protocols. The Monitor is a component of the GRAVITE version 3.0 (GV3.0) system. It monitors the status of the various components, including those used from Apache OODT (Object Oriented Data Technology), and provides certain statistics. The GV3.0 system receives very large volumes of data from multiple sources and requires many PGEs (Product Generation Executables) to be run against the data. Operators must be able to analyze the incoming data without needing to examine every file individually. The general problem of monitoring a complex software environment via status and statistics is common and well documented.

The GV3.0 system uses various status and statistics systems to monitor the progress and health of the overall system. Via the Web interface, the Monitor gives operators and privileged users situational awareness of the GV3.0 system. It provides operators with both a snapshot and a historical perspective of the system status, allowing them to identify known problems and analyze unforeseen problems with the data flow.

The following statuses are currently monitored:

  • Current status (running/not running) of each file manager (Apache OODT component)
  • Current status (running/not running, average time per crawl) of each crawler (Apache OODT component)
  • Current status (running/not running) of the workflow manager (Apache OODT component)
  • Current status (running/not running, number of jobs in queue) of the resource manager (Apache OODT component)
  • Current status (running/not running, number of jobs simultaneously running) of each execution node
  • Current status of each PGE execution time period (has not run, is preparing to run, is running, or is completed), along with the error code (if any), the time to prepare, and the time to run
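
The per-PGE status fields listed above can be pictured as a small record type. The following is only a minimal sketch in Python, not the GV3.0 implementation; the names PgeState and PgeStatus, the field names, and the summary format are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class PgeState(Enum):
    """The four PGE states named in the list above (enum names are hypothetical)."""
    NOT_RUN = "has not run"
    PREPARING = "is preparing to run"
    RUNNING = "is running"
    COMPLETED = "is completed"


@dataclass
class PgeStatus:
    """Status of one PGE for one execution time period (hypothetical layout)."""
    name: str                               # PGE identifier
    period: str                             # label of the execution time period
    state: PgeState = PgeState.NOT_RUN
    error_code: Optional[int] = None        # None when no error was reported
    prepare_seconds: Optional[float] = None
    run_seconds: Optional[float] = None

    def summary(self) -> str:
        """Return a one-line, human-readable status string."""
        parts = [f"{self.name} [{self.period}]: {self.state.value}"]
        if self.error_code is not None:
            parts.append(f"error code {self.error_code}")
        if self.prepare_seconds is not None:
            parts.append(f"prepare {self.prepare_seconds:.1f} s")
        if self.run_seconds is not None:
            parts.append(f"run {self.run_seconds:.1f} s")
        return ", ".join(parts)
```

Keeping the error code and the two timings optional lets the same record describe a PGE that has not yet run as well as one that has completed.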

The following statistics are generated and reported:

  • Number of files ingested per time period per landing zone
  • Number of bytes ingested per time period per landing zone
  • Number of files that failed to ingest per landing zone at a specific time
  • Number of bytes that failed to ingest per landing zone at a specific time
  • Number of files per orbit and sub-type received by the system
  • Number of files per day by source, type, sub-type, and user-type
  • Number of bytes per day by source, type, sub-type, and user-type
  • Number of subscriptions served per unit time
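
As an illustration of the first two statistics (files and bytes ingested per time period per landing zone), the sketch below groups ingest records into hourly buckets. It is a hedged example only: the record layout, the function name ingest_totals_per_hour, and the choice of one hour as the time period are assumptions, not details of the GV3.0 system.

```python
from collections import defaultdict
from datetime import datetime
from typing import Dict, Iterable, Tuple

# Hypothetical record layout: (landing zone, ingest time, file size in bytes).
IngestRecord = Tuple[str, datetime, int]


def ingest_totals_per_hour(
    records: Iterable[IngestRecord],
) -> Dict[Tuple[str, datetime], Tuple[int, int]]:
    """Return {(landing zone, start of hour): (file count, byte count)}."""
    totals = defaultdict(lambda: [0, 0])
    for zone, when, size in records:
        # Truncate the timestamp to the start of its hour to form the bucket key.
        hour = when.replace(minute=0, second=0, microsecond=0)
        bucket = totals[(zone, hour)]
        bucket[0] += 1       # one more file in this bucket
        bucket[1] += size    # add this file's bytes
    return {key: (count, nbytes) for key, (count, nbytes) in totals.items()}
```

The same grouping pattern would extend to the per-day statistics by changing the key, for example to (day, source, type, sub-type, user-type).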

Status is communicated via an Extensible Markup Language Remote Procedure Call (XML-RPC) component of each server.
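
A running/not-running check over XML-RPC might look like the following sketch, which uses only Python's standard xmlrpc.client module. The function name poll_status, the server names and URLs passed to it, and the remote method name "is_running" are hypothetical placeholders; the actual method names exposed by each GV3.0 or Apache OODT server are not given here and would need to be substituted.

```python
import xmlrpc.client
from typing import Dict


def poll_status(servers: Dict[str, str], method: str = "is_running") -> Dict[str, str]:
    """Call a no-argument XML-RPC method on each server and report its status.

    `servers` maps a display name to an XML-RPC URL. `method` is a
    hypothetical remote method name that is assumed to return a boolean.
    """
    report = {}
    for name, url in servers.items():
        try:
            proxy = xmlrpc.client.ServerProxy(url)
            alive = bool(getattr(proxy, method)())
            report[name] = "running" if alive else "not running"
        except (OSError, xmlrpc.client.Error):
            # Connection failures and XML-RPC faults are both treated as "not running".
            report[name] = "not running"
    return report
```

Treating any connection error or fault as "not running" keeps the snapshot simple; a fuller monitor would likely distinguish an unreachable server from one that answers with an error.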

This work was done by Peyush Jain, Richard Ullman, and Gyanesh Chander of NASA Goddard Space Flight Center; and David Trang and Wayne McCullough of Global Science & Technology, Inc. GSC-16928-1