A recently developed theoretical and computational method is especially well suited for analyzing time-series data that represent nonstationary and nonlinear processes. The method stands in contrast to classical methods, including Fourier analysis, that are generally applicable only to periodic or stationary data that represent linear processes. The present method is based principally on the concept of empirical mode decomposition (EMD), according to which any complicated set of data can be decomposed into a finite (and often small) number of functions, called "intrinsic mode functions" (IMFs), that admit well-behaved Hilbert transforms.
This decomposition method is adaptive and, therefore, highly efficient. The main innovations embodied in this method are (1) the introduction of the IMFs, which are based on local properties of the signal to be analyzed and which give meaning to the concept of instantaneous frequency; and (2) the introduction of instantaneous frequencies for complicated sets of data, which eliminates the need for spurious harmonics to represent nonlinear and nonstationary processes.
Without going into detail that would exceed the scope of this article, an IMF can be loosely defined as an oscillation mode that is embedded in the data to be analyzed and that is associated with a local time scale of the data. An IMF need not be a narrow-band signal; it can be amplitude- and/or frequency-modulated and can even be nonstationary. The formal criteria for identifying an IMF are that (1) in the whole set of data, the numbers of extrema and zero crossings must be equal or differ by 1 at most, and (2) at any point, the mean value of the envelope defined by the local maxima and the envelope defined by the local minima must be zero.
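Criterion (1) can be checked directly on sampled data. The following is a minimal sketch, assuming uniformly sampled, noise-free data held in a Python list; the function names are illustrative and are not part of the published method, and criterion (2), which requires the envelopes, is not tested here.

```python
import math


def count_extrema(x):
    """Count strict local maxima and minima among the interior samples."""
    count = 0
    for i in range(1, len(x) - 1):
        is_max = x[i] > x[i - 1] and x[i] > x[i + 1]
        is_min = x[i] < x[i - 1] and x[i] < x[i + 1]
        if is_max or is_min:
            count += 1
    return count


def count_zero_crossings(x):
    """Count sign changes between consecutive samples."""
    return sum(1 for a, b in zip(x, x[1:]) if a * b < 0)


def satisfies_count_criterion(x):
    """IMF criterion (1): the two counts are equal or differ by at most 1."""
    return abs(count_extrema(x) - count_zero_crossings(x)) <= 1


t = [i / 400.0 for i in range(400)]

# A single oscillation mode: every extremum is followed by a zero crossing.
single_mode = [math.sin(2.0 * math.pi * 5.0 * tv + 0.3) for tv in t]

# A small fast wave riding on a large slow wave: many extrema, few crossings.
riding_waves = [math.sin(2.0 * math.pi * tv)
                + 0.3 * math.sin(2.0 * math.pi * 20.0 * tv) for tv in t]

print(satisfies_count_criterion(single_mode))
print(satisfies_count_criterion(riding_waves))
```

The second signal fails the test because most of its extrema never reach the zero line, which is the situation that sifting is meant to remove.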
The decomposition for a given set of data is based on direct extraction of the energies associated with various time scales and can be viewed as an expansion of the data in terms of the corresponding IMFs, which are derived from the data. The expansion can be linear or nonlinear, as dictated by the data, and it is complete and almost orthogonal. Most important, because the IMFs are derived from the data, the expansion is adaptive.
Locality and adaptivity are necessary for expanding nonlinear and nonstationary time series. The local energies and the instantaneous frequencies derived from the IMFs through Hilbert transforms can be used to construct a full energy-frequency-time distribution of the data. Such a distribution is called a Hilbert spectrum.
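How an instantaneous amplitude and frequency might be obtained from one IMF can be sketched as follows. This is an illustration only, under several assumptions: the IMF is uniformly sampled, the discrete analytic signal is built with a direct O(N^2) Fourier sum (adequate for short records), and the frequency is estimated from wrapped phase differences between neighboring samples. The function names are hypothetical.

```python
import cmath
import math


def analytic_signal(x):
    """Return x + i*H[x], where H is a discrete Hilbert transform."""
    n = len(x)
    spectrum = [sum(x[k] * cmath.exp(-2j * math.pi * m * k / n) for k in range(n))
                for m in range(n)]
    for m in range(n):
        if m == 0 or (n % 2 == 0 and m == n // 2):
            weight = 1.0   # keep the DC term (and the Nyquist term if n is even)
        elif m < (n + 1) // 2:
            weight = 2.0   # double the positive frequencies
        else:
            weight = 0.0   # discard the negative frequencies
        spectrum[m] *= weight
    return [sum(spectrum[m] * cmath.exp(2j * math.pi * m * k / n) for m in range(n)) / n
            for k in range(n)]


def instantaneous_amplitude_and_frequency(x, dt):
    """Amplitude at each sample; frequency between consecutive samples."""
    z = analytic_signal(x)
    amplitude = [abs(v) for v in z]
    phase = [cmath.phase(v) for v in z]
    frequency = []
    for p0, p1 in zip(phase, phase[1:]):
        # Wrap the phase step into [-pi, pi) before converting it to a frequency.
        dphi = (p1 - p0 + math.pi) % (2.0 * math.pi) - math.pi
        frequency.append(dphi / (2.0 * math.pi * dt))
    return amplitude, frequency


# Test signal: amplitude 1.5, four full cycles in a one-second record.
n = 64
dt = 1.0 / n
x = [1.5 * math.cos(2.0 * math.pi * 4.0 * k * dt) for k in range(n)]
amplitude, frequency = instantaneous_amplitude_and_frequency(x, dt)
print(min(amplitude), max(amplitude))
print(min(frequency), max(frequency))
```

The squared amplitude plays the role of the local energy; plotting it against time and instantaneous frequency for every IMF would give a distribution of the kind described above.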
The EMD is based on three assumptions: (1) the signal has at least two extrema (at least one maximum and at least one minimum), (2) the characteristic time scale is defined by the time between the extrema, and (3) if the data are devoid of extrema but contain inflection points, then the data can be differentiated one or more times to reveal extrema. Final results can then be obtained by integration(s) of the components.
The essence of the EMD is to identify the intrinsic oscillatory modes by their characteristic time scales, then decompose the data accordingly. The time scales are identified by the intervals between successive alternations of local maxima and minima and by the intervals between successive zero crossings. The interlaced local extrema and zero crossings contribute to the complicated set of data to be analyzed: one undulation rides on another, and they, in turn, ride on still other undulations, and so on. Each of these undulations defines a characteristic scale of the data; it is intrinsic to the process.
The IMFs are extracted from the data in a sifting process (see figure). By virtue of the definition of an IMF, the decomposition can be effected simply by use of the envelopes defined separately by the local maxima and the local minima. Once the extrema are identified, all the local maxima are connected by a cubic spline to form the upper envelope, and all the local minima are connected in the same way to form the lower envelope. The two envelopes should cover all the data between them. Their mean is designated $m_{10}$. The difference between the data and $m_{10}$ is the first component, $h_{10} = X(t) - m_{10}$, where $X(t)$ is the datum at time $t$.
Ideally, $h_{10}$ should be an IMF. In reality, overshoots and undershoots are common; they can generate new extrema and shift or exaggerate previously discovered extrema. Negative local maxima and positive local minima can also remain. Thus, it becomes necessary to repeat the sifting, using $h_{10}$ as the data: $h_{11} = h_{10} - m_{11}$, where $m_{11}$ is the mean of the upper and lower envelopes of $h_{10}$. In general, it may be necessary to repeat the sifting $k$ more times to arrive at the first IMF component, $c_1 = h_{1k} = h_{1(k-1)} - m_{1k}$. In the example of the figure, nine siftings in all ($h_{10}$ through $h_{18}$) are needed to obtain the first IMF component; that is, $c_1 = h_{18}$.
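The sifting just described can be sketched in a few dozen lines. This is a minimal illustration and not the authors' implementation. It assumes a natural cubic spline (zero second derivative at the ends), pins both envelopes to the first and last samples so that they span the whole record, and simply repeats the sifting a fixed number of times; none of these choices is specified in the text, and all function names are hypothetical.

```python
import bisect
import math


def natural_cubic_spline(xs, ys):
    """Return a callable natural cubic spline through the knots (xs, ys)."""
    n = len(xs)
    h = [xs[i + 1] - xs[i] for i in range(n - 1)]
    m = [0.0] * n  # second derivatives at the knots; zero at both ends
    if n > 2:
        diag = [2.0 * (h[i - 1] + h[i]) for i in range(1, n - 1)]
        rhs = [6.0 * ((ys[i + 1] - ys[i]) / h[i] - (ys[i] - ys[i - 1]) / h[i - 1])
               for i in range(1, n - 1)]
        for j in range(1, n - 2):        # forward elimination (tridiagonal system)
            w = h[j] / diag[j - 1]
            diag[j] -= w * h[j]
            rhs[j] -= w * rhs[j - 1]
        m[n - 2] = rhs[-1] / diag[-1]
        for j in range(n - 4, -1, -1):   # back substitution
            m[j + 1] = (rhs[j] - h[j + 1] * m[j + 2]) / diag[j]

    def evaluate(x):
        i = min(max(bisect.bisect_right(xs, x) - 1, 0), n - 2)
        a, b = xs[i + 1] - x, x - xs[i]
        return ((m[i] * a ** 3 + m[i + 1] * b ** 3) / (6.0 * h[i])
                + (ys[i] / h[i] - m[i] * h[i] / 6.0) * a
                + (ys[i + 1] / h[i] - m[i + 1] * h[i] / 6.0) * b)

    return evaluate


def envelope_mean(t, x):
    """Mean of the spline envelopes through the local maxima and local minima."""
    last = len(x) - 1
    maxima = [i for i in range(1, last) if x[i - 1] < x[i] >= x[i + 1]]
    minima = [i for i in range(1, last) if x[i - 1] > x[i] <= x[i + 1]]
    # Assumption: the end samples are added as knots of both envelopes.
    upper_knots = [0] + maxima + [last]
    lower_knots = [0] + minima + [last]
    upper = natural_cubic_spline([t[i] for i in upper_knots], [x[i] for i in upper_knots])
    lower = natural_cubic_spline([t[i] for i in lower_knots], [x[i] for i in lower_knots])
    return [0.5 * (upper(tv) + lower(tv)) for tv in t]


def sift(t, x, n_sifts=5):
    """Repeat h <- h - (envelope mean of h); n_sifts is an assumed fixed count."""
    h = list(x)
    for _ in range(n_sifts):
        mean = envelope_mean(t, h)
        h = [hv - mv for hv, mv in zip(h, mean)]
    return h


# Test signal: a 20 Hz oscillation riding on a slow linear trend.
t = [i / 1000.0 for i in range(1001)]
fast = [math.sin(2.0 * math.pi * 20.0 * tv) for tv in t]
x = [f + 2.0 * tv for f, tv in zip(fast, t)]
c1 = sift(t, x)

# Compare only the central part of the record, away from the end effects.
worst = max(abs(c1[i] - fast[i]) for i in range(400, 601))
print(worst)
```

In this example the extracted component follows the fast oscillation closely in the middle of the record. Near the ends, the crude end treatment distorts the envelopes, which is one reason practical implementations handle the ends more carefully.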
The first IMF component should be the one associated with the shortest time scale in the data. The second IMF component ($c_2$), associated with a longer time scale, is obtained by sifting the first residue signal, $r_1 = X(t) - c_1$. In general, the $n$th IMF component ($c_n$ for $n > 1$), associated with a time scale longer than that of the $(n-1)$st component, is obtained by sifting the residue $r_{n-1} = r_{n-2} - c_{n-1}$, where $r_0 = X(t)$. The sifting can be stopped when either of the following criteria is satisfied: (1) $c_n$ or $r_n$ becomes smaller than a predetermined minimum value of substantial consequence or (2) $r_n$ becomes a monotonic function from which no further IMF can be extracted. The sifting process can also be used as a time-domain filter.
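Written out, the residue recursion telescopes into an expansion of the data as a sum of IMFs plus a final residue. In the sketch below, $r_0$ and $N$ (the number of IMFs extracted before a stopping criterion is met) are symbols introduced here only for convenience.

```latex
\begin{aligned}
r_0 &= X(t), \\
r_n &= r_{n-1} - c_n, \qquad n = 1, 2, \ldots, N, \\
X(t) &= \sum_{n=1}^{N} c_n + r_N .
\end{aligned}
```

The last line follows by summing the second line over $n$; it expresses the sense in which the expansion is complete, and dropping selected terms from the sum is what allows the sifting to act as a time-domain filter.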
This work was done by Norden E. Huang and Steven Long of Goddard Space Flight Center. For further information, access the Technical Support Package (TSP) free on-line at www.nasatech.com/tsp under the Information Sciences category.
This invention is owned by NASA, and a patent application has been filed. Inquiries concerning nonexclusive or exclusive license for its commercial development should be addressed to the Patent Counsel, Goddard Space Flight Center; (301) 286-7351. Refer to GSC-13817.