Vector-Ordering Filter Procedure for Data Reduction
- Created on Monday, 01 December 2003
The essential characteristics of original large sets of data are preserved.
The vector-ordering filter (VOF) technique is a procedure for sampling a large population of data vectors to select a subset that fully characterizes the state space of the large population. The VOF technique greatly reduces the volume of data that must be handled in the automated-monitoring system and method discussed in the two immediately preceding articles. In so doing, it enables the development of data-driven mathematical models of a monitored asset from sets of data that would otherwise exceed the memory capacities of conventional engineering computers.
Data-driven mathematical models have been shown to offer high fidelity for purposes of control and monitoring of assets. In practice, a collection of asset-operating observations is acquired with the intention that the collection contain observations characteristic of the full dynamic range of operation of the asset. Often, such a collection contains an extremely large number of observations, many of which are redundant. The VOF technique fills the need for a means to extract, from the original collection of observational data, a reduced data matrix that excludes redundant data while maintaining the full statistical character and dynamic range of the original data. The reduced data matrix can then be used as the input data for development of a mathematical model of the monitored asset, or as training data for a neural-network substitute for an explicit mathematical model of the asset. Alternatively, the reduced data matrix can, itself, be used directly as a mathematical model of the monitored asset, as is commonly done in multivariate state-estimation techniques.
The original data are collected from the asset over a range of operating states and are put in matrix form. Each column vector in the original data matrix represents the signal values acquired at a particular operational state of the asset. Thus, the number of columns of the original data matrix equals the number of observed states and the number of rows in this matrix equals the number of signals acquired at each observation. In the VOF technique, one extracts the reduced data matrix from the original data matrix through the selection of a representative subset of the column (state) vectors.
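The matrix layout described above can be illustrated with a minimal Python sketch. It assumes the signals arrive as equal-length rows (one row per signal) and turns them into a list of column (state) vectors. The function name `to_state_vectors` and the sample values are hypothetical and are not taken from the original work.

```python
def to_state_vectors(signal_rows):
    """Turn rows of signal samples into a list of column (state) vectors.

    signal_rows[r][c] is the value of signal r at observation c, so the
    result holds one vector per observation, each with one entry per signal.
    """
    lengths = {len(row) for row in signal_rows}
    if len(lengths) > 1:
        raise ValueError("every signal needs one value per observation")
    return [list(column) for column in zip(*signal_rows)]


# Hypothetical data: two signals sampled at three operating states.
rows = [
    [20.0, 21.5, 19.0],  # signal 1
    [1.2, 1.4, 1.1],     # signal 2
]
states = to_state_vectors(rows)
```

Here `len(states)` is the number of observed states and `len(states[0])` is the number of signals, matching the column and row counts of the original data matrix.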
In its simplest form, the VOF procedure has two stages. In the first stage, one selects those data vectors that characterize the extrema present in the original data. One begins by finding the extrema of each signal, namely the minimum and maximum values in the corresponding row of the original data matrix. The column vectors that contain these values are selected as candidates for inclusion in the reduced data matrix. Before a candidate column vector is added to the reduced data matrix, it is compared with the column vectors already there to ensure that only one copy of that vector ends up in the reduced data matrix. In the second stage, one orders the column vectors of the original data matrix by their Euclidean norms and then selects a subset of the vectors according to a spacing criterion. All column vectors selected in this way are compared with those selected during the first stage, and only those not already included in the reduced data matrix are added to it.
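The two stages can be sketched in plain Python as follows. This is an illustrative sketch, not the authors' implementation. The article does not define the spacing criterion, so the sketch assumes the simplest possible one: keep every `step`-th vector of the norm-ordered list. The names `vof_reduce` and `step` are hypothetical.

```python
import math


def vof_reduce(columns, step):
    """Return the sorted indices of the columns kept by a two-stage sketch.

    columns: list of observation vectors (columns of the data matrix).
    step:    ASSUMED spacing criterion; every `step`-th vector of the
             norm-ordered list is kept.
    """
    if not columns:
        return []
    selected = []   # indices of columns in the reduced matrix
    seen = set()    # vector values already present in the reduced matrix

    def add(index):
        # Compare the candidate with vectors already kept, so that only
        # one copy of any given vector is retained.
        key = tuple(columns[index])
        if key not in seen:
            seen.add(key)
            selected.append(index)

    # Stage 1: columns holding the minimum and maximum of each signal (row).
    for row in range(len(columns[0])):
        values = [column[row] for column in columns]
        add(values.index(min(values)))
        add(values.index(max(values)))

    # Stage 2: order columns by Euclidean norm, then sample at fixed spacing.
    def norm(index):
        return math.sqrt(sum(x * x for x in columns[index]))

    ordered = sorted(range(len(columns)), key=norm)
    for index in ordered[::step]:
        add(index)

    return sorted(selected)


# Hypothetical data: six observations of two signals.
data = [[0, 0], [1, 5], [2, 1], [3, 2], [4, -1], [5, 3]]
kept = vof_reduce(data, step=2)
```

For this sample, stage one keeps columns 0, 5, 4, and 1 (the per-signal extremes), and stage two contributes only column 3, because the other sampled columns were already selected. Column 2 is dropped as redundant.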
Practical sets of data tend to be so large that excessive computer memory would be necessary for a single pass of the two-stage VOF procedure over all the data. In such a case, the VOF procedure can be applied recursively to successive subsets of the original data that are small enough to fit in the available memory. The figure presents plots from an example of a two-pass application of the VOF technique to some space-shuttle engine vibration data.
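One possible reading of this multi-pass scheme is sketched below: the data are cut into chunks of at most `chunk_size` vectors, each chunk is reduced separately, and the surviving vectors are pooled and reduced again until they fit in a single chunk. The function names, the `chunk_size` parameter, and the stand-in reducer `keep_extremes` (which performs only an extrema-selection stage) are assumptions for illustration. A real application would presumably read each chunk from storage rather than hold the whole list in memory.

```python
def keep_extremes(chunk):
    """Stand-in reducer: keep the columns holding each signal's min and max."""
    keep = set()
    for row in range(len(chunk[0])):
        values = [column[row] for column in chunk]
        keep.add(values.index(min(values)))
        keep.add(values.index(max(values)))
    reduced, seen = [], set()
    for index in sorted(keep):
        key = tuple(chunk[index])
        if key not in seen:  # retain only one copy of each vector
            seen.add(key)
            reduced.append(chunk[index])
    return reduced


def reduce_in_passes(columns, chunk_size, reduce_fn):
    """Apply reduce_fn chunk by chunk, repeating on the pooled survivors."""
    current = list(columns)
    while True:
        reduced = []
        for start in range(0, len(current), chunk_size):
            reduced.extend(reduce_fn(current[start:start + chunk_size]))
        # Stop once everything fit in one chunk, or if a pass removed nothing.
        if len(current) <= chunk_size or len(reduced) >= len(current):
            return reduced
        current = reduced


# Hypothetical data: ten observations of a single signal.
samples = [[v] for v in [5, 3, 8, 1, 9, 2, 7, 4, 6, 0]]
result = reduce_in_passes(samples, chunk_size=4, reduce_fn=keep_extremes)
```

With this reducer, the global minimum and maximum of each signal survive every pass, because a global extreme is also the extreme of whichever chunk it falls in.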