2012

Physics Mining of Multi-Source Data Sets

Powerful new parallel data mining algorithms can produce diagnostic and prognostic numerical models and analyses from observational data. By fusing synoptic imagery with time-series measurements, these techniques yield higher-resolution measures of environmental parameters than were previously available. The techniques are general: they apply to raster, vector, and scalar observational data, and can be used in all Earth- and environmental-science domains. Because they are parallel and can be highly automated, they scale to large spatial domains and are well suited to change and gap detection. This makes it possible to analyze spatial and temporal gaps in information, and facilitates within-mission re-planning to optimize the allocation of observational resources.
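As a rough illustration of temporal gap detection, the following Python sketch flags intervals in a time series where consecutive samples are farther apart than the expected sampling step. It is not taken from MineTool; the function name `find_temporal_gaps`, the `tolerance` parameter, and the sample timestamps are assumptions made for illustration only.

```python
def find_temporal_gaps(timestamps, expected_step, tolerance=0.5):
    """Return (start, end) pairs where sampling is sparser than expected.

    Hypothetical helper, not part of MineTool. A gap is reported when two
    consecutive timestamps differ by more than expected_step * (1 + tolerance).
    """
    ordered = sorted(timestamps)
    limit = expected_step * (1 + tolerance)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > limit]


# Synthetic timestamps (arbitrary units) with two missing stretches.
print(find_temporal_gaps([0, 1, 2, 5, 6, 10], expected_step=1))
# prints [(2, 5), (6, 10)]
```

Each reported interval could then be treated as a candidate for additional observations when re-planning.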

The basis of the innovation is the extension of a recently developed set of algorithms, packaged as MineTool, to multivariate time-series data. MineTool is unique in that it automates the various steps of the data mining process, which makes it amenable to autonomous analysis of large data sets. Unlike techniques such as artificial neural networks, which yield a black-box solution, MineTool always produces an analytical model in parametric form that expresses the output in terms of the input variables. The derived equation can then be used to gain insight into the physical relevance and relative importance of the parameters and coefficients in the model. This is referred to as “physics mining of data.” The capabilities of MineTool are extended to include both supervised and unsupervised algorithms, to handle multi-type data sets, and to run in parallel.
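To make the idea of an analytical model in parametric form concrete, the sketch below fits y = w0 + w1*x1 + w2*x2 to data and returns the coefficients, which can be read and compared directly. The brief does not describe MineTool's actual algorithms, so ordinary least squares is used here as a stand-in; the function name `fit_linear_model` and the synthetic data are assumptions for illustration.

```python
def fit_linear_model(rows, targets):
    """Least-squares fit of y = w0 + w1*x1 + ... + wk*xk.

    Illustrative stand-in, not MineTool's method. Solves the normal
    equations (X^T X) w = X^T y by Gaussian elimination with partial
    pivoting. Returns [w0, w1, ..., wk].
    """
    X = [[1.0] + [float(v) for v in r] for r in rows]  # prepend intercept term
    n = len(X[0])
    A = [[sum(x[i] * x[j] for x in X) for j in range(n)] for i in range(n)]
    b = [sum(x[i] * y for x, y in zip(X, targets)) for i in range(n)]

    # Forward elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= factor * A[col][c]
            b[r] -= factor * b[col]

    # Back substitution.
    w = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(A[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (b[i] - tail) / A[i][i]
    return w


# Synthetic data generated from y = 2 + 3*x1 - 0.5*x2 (no noise).
inputs = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 3)]
outputs = [2 + 3 * x1 - 0.5 * x2 for x1, x2 in inputs]
coefficients = fit_linear_model(inputs, outputs)
print(coefficients)  # approximately recovers the intercept and the two slopes
```

Unlike the weights of a neural network, each fitted coefficient here maps to a single input variable, so its sign and magnitude can be interpreted directly.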

The innovations include: (1) physics mining algorithms, which enable derivation of analytical relations and physical models from observational data; (2) automated, parallel algorithms, which scale to large spatial domains and are well suited to change and gap detection; (3) local versus global modeling, which generates locally optimal models for a specific geospatial region, accounting for its unique setting and conditions; (4) fusion of multi-source, multi-type data, which yields higher-resolution measures by combining synoptic imagery with independent time-series measurements; and (5) calculation of an analogue of the Palmer Drought Severity Index.
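The contrast between local and global modeling in item (3) can be sketched as follows: one straight line is fitted to all data pooled together, and separate lines are fitted per region, then the residual errors are compared. This is a toy example under stated assumptions, not the project's method; the region names, the helper functions `fit_line` and `sum_squared_error`, and the data are all hypothetical.

```python
def fit_line(xs, ys):
    """Simple least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def sum_squared_error(xs, ys, model):
    """Sum of squared residuals of a (slope, intercept) model."""
    slope, intercept = model
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


# Hypothetical regions whose responses follow different relations.
regions = {
    "coastal": ([0, 1, 2, 3], [1, 3, 5, 7]),   # y = 2x + 1
    "inland": ([0, 1, 2, 3], [5, 4, 3, 2]),    # y = -x + 5
}

# Global model: one line fitted to all regions pooled together.
all_x = [x for xs, _ in regions.values() for x in xs]
all_y = [y for _, ys in regions.values() for y in ys]
global_error = sum_squared_error(all_x, all_y, fit_line(all_x, all_y))

# Local models: one line per region.
local_models = {name: fit_line(xs, ys) for name, (xs, ys) in regions.items()}
local_error = sum(
    sum_squared_error(xs, ys, local_models[name])
    for name, (xs, ys) in regions.items()
)

print(local_models)
print(global_error, local_error)
```

When regions follow different relations, as in this synthetic data, the per-region models fit far better than the pooled one, at the cost of more parameters to estimate.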

Successful completion of this project is expected to lead to a major advance in climate studies in particular, and more broadly in the analysis of multi-source data on the hydrologic cycle, which bears on climate change impacts and resource management.

This work was done by John Helly, Homa Karimabadi, and Tamara Sipes of SciberQuest, Inc. for Goddard Space Flight Center. GSC-15802-1

This Brief includes a Technical Support Package (TSP).

Physics Mining of Multi-Source Data Sets (reference GSC-15802-1) is currently available for download from the TSP library.
