Algorithm for Compressing Time-Series Data
- Created: Friday, 01 June 2012
This algorithm is generally applicable to many types of data.
An algorithm based on Chebyshev polynomials effects lossy compression of time-series data or other one-dimensional data streams (e.g., spectral data) that are arranged in blocks for sequential transmission. The algorithm was developed for use in transmitting data from spacecraft scientific instruments to Earth stations. In spite of its lossy nature, the algorithm preserves the information needed for scientific analysis. The algorithm is computationally simple, yet compresses data streams by factors much greater than two. The algorithm is not restricted to spacecraft or scientific uses: it is applicable to time-series data in general. The algorithm can also be applied to general multidimensional data that have been converted to time-series data, a typical example being image data acquired by raster scanning. However, unlike most prior image-data-compression algorithms, this algorithm neither depends on nor exploits the two-dimensional spatial correlations that are generally present in images.
Chebyshev Transforms are calculated in this algorithm, which effects lossy compression of data. Parameters of the algorithm can be adjusted to balance accuracy versus degree of compression." class="caption" align="right">In order to understand the essence of this compression algorithm, it is necessary to understand that the net effect of this algorithm and the associated decompression algorithm is to approximate the original stream of data as a sequence of finite series of Chebyshev polynomials. For the purpose of this algorithm, a block of data or interval of time for which a Chebyshev polynomial series is fitted to the original data is denoted a fitting interval. Chebyshev approximation has two properties that make it particularly effective for compressing serial data streams with minimal loss of scientific information: The errors associated with a Chebyshev approximation are nearly uniformly distributed over the fitting interval (this is known in the art as the “equal error property”); and the maximum deviations of the fitted Chebyshev polynomial from the original data have the smallest possible values (this is known in the art as the “min-max property”).
The algorithm performs the same sequence of calculations on each successive data block (see figure). For each block, the first step is a calculation of a Chebyshev transform; that is, a matrix of coefficients of a Chebyshev series. This involves calculation of linear combinations of data samples with the applicable Chebyshev coefficients. The Chebyshev coefficients are fixed and known, making it possible to reduce the computational burden by computing them in advance, storing them in lookup tables, and retrieving them from the lookup tables as needed. In the next step, the matrix of coefficients is thresholded: only those coefficients larger than a threshold specified by the user are retained. The retained coefficients are then quantized to reduce their representations to no more than a number of bits specified by the user.