An algorithm based on Chebyshev polynomials effects lossy compression of time-series data or other one-dimensional data streams (e.g., spectral data) that are arranged in blocks for sequential transmission. The algorithm was developed for use in transmitting data from spacecraft scientific instruments to Earth stations. In spite of its lossy nature, the algorithm preserves the information needed for scientific analysis. The algorithm is computationally simple, yet compresses data streams by factors much greater than two. The algorithm is not restricted to spacecraft or scientific uses: it is applicable to time-series data in general. The algorithm can also be applied to general multidimensional data that have been converted to time-series data, a typical example being image data acquired by raster scanning. However, unlike most prior image-data-compression algorithms, this algorithm neither depends on nor exploits the two-dimensional spatial correlations that are generally present in images.
In order to understand the essence of this compression algorithm, it is necessary to understand that the net effect of this algorithm and the associated decompression algorithm is to approximate the original stream of data as a sequence of finite series of Chebyshev polynomials. For the purpose of this algorithm, a block of data or interval of time for which a Chebyshev polynomial series is fitted to the original data is denoted a fitting interval. Chebyshev approximation has two properties that make it particularly effective for compressing serial data streams with minimal loss of scientific information: The errors associated with a Chebyshev approximation are nearly uniformly distributed over the fitting interval (this is known in the art as the “equal error property”); and the maximum deviations of the fitted Chebyshev polynomial from the original data have the smallest possible values (this is known in the art as the “min-max property”).
The algorithm performs the same sequence of calculations on each successive data block (see figure). For each block, the first step is a calculation of a Chebyshev transform; that is, a matrix of coefficients of a Chebyshev series. This involves calculation of linear combinations of data samples with the applicable Chebyshev coefficients. The Chebyshev coefficients are fixed and known, making it possible to reduce the computational burden by computing them in advance, storing them in lookup tables, and retrieving them from the lookup tables as needed. In the next step, the matrix of coefficients is thresholded: only those coefficients larger than a threshold specified by the user are retained. The retained coefficients are then quantized to reduce their representations to no more than a number of bits specified by the user.
Next, there is generated a bit-control word, which is to be used during the subsequent decompression process to indicate the locations for insertion of the quantized retained coefficients and for insertion of place holders (zeroes) at locations of coefficients that are not retained. The bit-control word is then encoded by a lossless compression technique; this step can significantly increase the overall compression ratio without introducing additional loss. If there are more data blocks to be processed, then the process as described thus far is repeated for the next block. If there are no more blocks to be processed, the compressed data and their control words are transmitted.
The results obtained by use of the algorithm depend partly on three parameters: the block size (the number of data samples in a block), the aforementioned threshold value, and the aforementioned number of quantization bits. By adjusting the values of these parameters for different types of data, one can obtain usefully large compression ratios with minimal errors. Higher threshold values always result in greater compression ratios at the expense of quality of reconstructed signals. In creasing numbers of quantization bits generally reduces compression ratios but yields reconstructed signals of higher quality. Increasing block sizes yields more-varied results; in general, larger compression ratios are associated with larger blocks because fewer block maxima and minima are stored.
This work was done by S. Edward Hawkins III and Edward Hugo Darlington of Johns Hopkins University Applied Physics Laboratory for Goddard Space Flight Center. For further information, contact the Goddard Innovative Partnerships Office at (301) 286-5810. GSC-14820-1