The Inductive Monitoring System (IMS) software tool uses data mining techniques to automatically characterize nominal system operation by analyzing archived system data. These nominal characterizations are then used to perform near-real-time system health monitoring or to analyze archived system data to detect anomalies in system behavior as compared with previous nominal behavior.
Automatic system health monitoring can benefit significantly from an accurate characterization or model of expected system behavior. IMS was motivated by the difficulty of producing detailed health monitoring and diagnostic models of some system components, owing to their complexity or to the unavailability of design information. Most current health monitoring schemes simply check that system parameters do not exceed predetermined extreme thresholds, and so may not detect early signs of anomalous behavior. Because off-nominal system data are frequently difficult to obtain, IMS is designed to build a monitoring knowledge base using only nominal system data. The resulting knowledge base shows the relationships between system parameters during nominal operation, and is easily processed to provide real-time or near-real-time monitoring in most circumstances.
IMS considers system parameter values within the context of other related system parameters by grouping multiple parameters into a vector and analyzing them simultaneously. The geometric distance between two system parameter vectors provides a measure of their similarity, with smaller distances indicating more similar system behavior. IMS uses a machine learning technique called clustering, which groups nearby nominal parameter vectors into clusters. Calculating the distance from an individual parameter vector collected from the system to a previously established set of nominal system clusters is an efficient method for assessing current system health.
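The distance-to-cluster idea can be illustrated with a minimal Python sketch. It is not the IMS implementation. It assumes that each cluster is stored as an axis-aligned box (a minimum and maximum value per parameter) and that a nominal vector within a chosen tolerance of an existing cluster expands that cluster rather than starting a new one. The names box_distance, learn_clusters, deviation, and tolerance are hypothetical and chosen only for this illustration.

```python
import math


def box_distance(vec, lo, hi):
    """Euclidean distance from vec to the box [lo, hi]; 0.0 if vec lies inside."""
    return math.sqrt(
        sum(max(l - x, 0.0, x - h) ** 2 for x, l, h in zip(vec, lo, hi))
    )


def learn_clusters(nominal_vectors, tolerance):
    """Build clusters from nominal data only (illustrative assumption, not IMS code)."""
    clusters = []  # each cluster is a two-element list: [lo, hi]
    for vec in nominal_vectors:
        best = min(
            clusters,
            key=lambda c: box_distance(vec, c[0], c[1]),
            default=None,
        )
        if best is not None and box_distance(vec, best[0], best[1]) <= tolerance:
            # Close enough: grow the nearest cluster so that it contains vec.
            best[0] = [min(l, x) for l, x in zip(best[0], vec)]
            best[1] = [max(h, x) for h, x in zip(best[1], vec)]
        else:
            # Too far from every existing cluster: start a new one at vec.
            clusters.append([list(vec), list(vec)])
    return clusters


def deviation(vec, clusters):
    """Distance from vec to the nearest nominal cluster (0.0 means nominal)."""
    return min(box_distance(vec, lo, hi) for lo, hi in clusters)
```

In this sketch, a vector collected during monitoring that falls inside any learned box has a deviation of zero, and larger deviations indicate behavior further from anything seen in the nominal training data.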
Updated cluster indexing and retrieval techniques provide a new, efficient approach to finding the cluster (i.e., region of a vector space) closest to a given vector for use in distance-based anomaly detection systems. The new techniques respond to closest-cluster queries faster than previous techniques and require less computer memory. This allows software such as IMS to process data at higher rates, with fewer computing resources, or both, thereby increasing the potential applications for the software. It also allows a given computer system to use larger, more detailed system characterization models (containing more clusters) than was possible with previous techniques, which enables more precise system analysis.
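The text does not describe how the updated indexing works, so the following Python sketch shows only one generic way that a closest-cluster query can avoid examining every cluster. It is an assumption made for illustration, not the IMS technique. Each cluster is simplified to a single center point, the centers are sorted by their distance to a fixed reference point, and the triangle inequality is used to stop the search early. The class name ReferenceIndex and its methods are hypothetical.

```python
import bisect
import math


class ReferenceIndex:
    """Cluster centers sorted by distance to a reference point (illustrative only)."""

    def __init__(self, centers, ref=None):
        # Assumption: the origin is used as the reference point unless one is given.
        self.ref = ref if ref is not None else [0.0] * len(centers[0])
        pairs = sorted((math.dist(c, self.ref), tuple(c)) for c in centers)
        self.keys = [k for k, _ in pairs]
        self.centers = [c for _, c in pairs]

    def nearest(self, q):
        """Return (center, distance, number_of_centers_examined) for query q."""
        dq = math.dist(q, self.ref)
        right = bisect.bisect_left(self.keys, dq)
        left = right - 1
        best_d, best_c, examined = math.inf, None, 0
        while left >= 0 or right < len(self.keys):
            # By the triangle inequality, |key - dq| is a lower bound on the
            # true distance from q to the center stored under that key.
            gap_l = dq - self.keys[left] if left >= 0 else math.inf
            gap_r = self.keys[right] - dq if right < len(self.keys) else math.inf
            if min(gap_l, gap_r) >= best_d:
                break  # no remaining center can be closer than the best found
            if gap_l <= gap_r:
                i, left = left, left - 1
            else:
                i, right = right, right + 1
            d = math.dist(q, self.centers[i])
            examined += 1
            if d < best_d:
                best_d, best_c = d, self.centers[i]
        return best_c, best_d, examined
```

The query starts at the centers whose reference distance is closest to that of the query vector and works outward in both directions. Because the lower bound only grows as the search moves outward, the loop can stop as soon as the bound reaches the best distance found so far, and the result is the same as that of an exhaustive scan.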
This work was done by David Iverson of Ames Research Center.