An alternative approach has been devised for encoding image data in compliance with JPEG 2000, the most recent still-image data-compression standard of the Joint Photographic Experts Group. Heretofore, JPEG 2000 encoding has been implemented by several related schemes classified as rate-based distortion-minimization encoding. In each of these schemes, the end user specifies a desired bit rate and the encoding algorithm strives to attain that rate while minimizing a mean squared error (MSE). While rate-based distortion minimization is appropriate for transmitting data over a limited-bandwidth channel, it is not the best approach for applications in which the perceptual quality of reconstructed images is a major consideration. A better approach for such applications is the present alternative one, denoted perceptual distortion control, in which the encoding algorithm strives to compress data to the lowest bit rate that yields at least a specified level of perceptual image quality.
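The contrast between the two approaches can be written as a pair of constrained optimization problems. The notation here is an assumption introduced for illustration and does not come from the original work: n_i is the number of coding passes retained for coding block i, R_i and D_i are the bit rate and MSE contribution of that block, and D_p is a perceptual distortion measure over the whole image.

```latex
% Rate-based distortion minimization: fix the rate, minimize MSE.
\min_{n_1,\dots,n_B} \; \sum_{i=1}^{B} D_i(n_i)
\quad \text{subject to} \quad \sum_{i=1}^{B} R_i(n_i) \le R_{\text{target}}

% Perceptual distortion control: fix the perceptual quality, minimize rate.
\min_{n_1,\dots,n_B} \; \sum_{i=1}^{B} R_i(n_i)
\quad \text{subject to} \quad D_p(n_1,\dots,n_B) \le D_{\text{target}}
```

In this reading, "at least a specified level of perceptual image quality" corresponds to the perceptual distortion staying at or below the target value.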
Some additional background information on JPEG 2000 is prerequisite to a meaningful summary of JPEG 2000 encoding with perceptual distortion control. The JPEG 2000 encoding process includes two subprocesses known as tier-1 and tier-2 coding. In order to minimize the MSE for the desired bit rate, a rate-distortion-optimization subprocess is introduced between the tier-1 and tier-2 subprocesses. In tier-1 coding, each coding block is independently bit-plane coded from the most-significant-bit (MSB) plane to the least-significant-bit (LSB) plane, using three coding passes per plane (except for the MSB plane, which is coded using only one "cleanup" coding pass). For M bit planes, this subprocess involves a total of (3M − 2) coding passes. An embedded bit stream is then generated for each coding block. Information on the reduction in distortion and the increase in the bit rate associated with each coding pass is collected. This information is then used in a rate-control procedure to determine the contribution of each coding block to the output compressed bit stream.
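The pass count can be checked by listing the passes explicitly. The following is a minimal sketch, not encoder code: the function name and the (plane, pass) tuple layout are illustrative assumptions, while the three pass types are the ones the JPEG 2000 block coder uses.

```python
# The three coding passes applied to every bit plane below the MSB plane.
PASS_TYPES = ("significance propagation", "magnitude refinement", "cleanup")


def tier1_pass_schedule(num_bit_planes):
    """List (plane_index, pass_name) pairs for one coding block.

    Plane index 0 is the MSB plane, which receives only a cleanup pass.
    """
    if num_bit_planes < 1:
        raise ValueError("a coding block needs at least one bit plane")
    schedule = [(0, "cleanup")]
    for plane in range(1, num_bit_planes):
        for pass_name in PASS_TYPES:
            schedule.append((plane, pass_name))
    return schedule


if __name__ == "__main__":
    M = 8
    schedule = tier1_pass_schedule(M)
    # One pass for the MSB plane plus three for each remaining plane: 3M - 2.
    print(len(schedule), 3 * M - 2)
```

For M = 8 the schedule holds 22 entries, which agrees with 3M − 2.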
In tier-2 coding, the results of those coding passes for each coding block that have not been discarded are organized into an output compressed bit stream. With a carefully optimized implementation of a discrete wavelet transform, the embedded block coding tends to dominate the whole encoding time; consequently, prior JPEG 2000 encoding algorithms waste computational power and memory on those coding passes that are eventually discarded. This concludes the background information.
A complete description of JPEG 2000 encoding with perceptual distortion control would greatly exceed the space available for this article, making it necessary to summarize briefly: The multiresolution wavelet decomposition and the two-tier coding structure of JPEG 2000 are amenable to incorporation of perceptual distortion control. In the present approach, one strives to determine the number of coding passes needed for each coding block by use of a perceptual model of the human visual system. Only that number of coding passes, and no more, then need be made in the tier-1 encoding.
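The article does not say how the number of passes is derived from the perceptual model. As a rough sketch of the idea only, suppose a hypothetical function can estimate the perceptual distortion that would remain in a block after a given number of passes; the smallest pass count that meets the target is then selected and later passes are never executed. All names and the toy distortion model below are assumptions for illustration.

```python
def passes_needed(estimate_distortion, max_passes, threshold):
    """Return the smallest pass count whose estimated distortion meets the target.

    estimate_distortion(n) is a hypothetical callable giving the perceptual
    distortion left in a block after n coding passes. If no count up to
    max_passes meets the threshold, max_passes is returned.
    """
    for n in range(max_passes + 1):
        if estimate_distortion(n) <= threshold:
            return n
    return max_passes


def toy_estimate(n):
    # Toy model (assumption): distortion starts at 8.0 and halves per pass.
    return 8.0 / (2 ** n)


if __name__ == "__main__":
    # A block with M = 8 bit planes has at most 3 * 8 - 2 = 22 passes.
    print(passes_needed(toy_estimate, max_passes=22, threshold=1.0))
```

The design point is that the loop stops at the first adequate pass count, in contrast to coding every pass and discarding the surplus afterward.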
A basic idea of the use of the perceptual model of the human visual system is to hide the coding distortion beneath detection thresholds, typically by exploiting the masking properties of the human visual system and establishing detection thresholds of just-noticeable distortion and minimally noticeable distortion based on psychophysical experiments. Among the masking properties included in the model are luminance masking (also known as light adaptation, in which the detection threshold varies with background light intensity) and contrast masking (in which the visibility of an image component is affected by other image components). The model also incorporates a perceptual distortion metric that takes account of spatial and spectral summations of quantization errors.
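The form of the distortion metric is not given in this article. One pooling rule commonly used in vision models is Minkowski summation of threshold-normalized errors, and the sketch below uses it purely as an assumed stand-in: errors are pooled first over positions within each wavelet subband (spatial summation) and then across subbands (spectral summation). The exponents, threshold values, and subband names are hypothetical.

```python
def minkowski_pool(values, beta):
    """Pool magnitudes as (sum |v|^beta)^(1/beta)."""
    return sum(abs(v) ** beta for v in values) ** (1.0 / beta)


def perceptual_distortion(errors_by_subband, thresholds,
                          beta_spatial=4.0, beta_spectral=4.0):
    """Pool quantization errors within each subband, then across subbands.

    errors_by_subband maps a subband name to a list of quantization errors;
    thresholds maps the same names to a detection threshold (assumed known,
    for example after masking adjustments). A result of 1.0 is read here as
    distortion sitting exactly at the detection threshold.
    """
    per_band = []
    for band, errors in errors_by_subband.items():
        t = thresholds[band]
        normalized = [e / t for e in errors]  # error in units of threshold
        per_band.append(minkowski_pool(normalized, beta_spatial))
    return minkowski_pool(per_band, beta_spectral)


if __name__ == "__main__":
    errors = {"HL1": [0.5, -0.25, 0.1], "LH1": [0.2, 0.2]}  # made-up values
    thresholds = {"HL1": 1.0, "LH1": 0.8}                   # made-up values
    print(perceptual_distortion(errors, thresholds))
```

Under this assumed form, a larger exponent makes the pooled value track the worst single error more closely, while an exponent of 2 behaves like a root-sum-of-squares energy measure.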
Experimental data have confirmed the expectation that in addition to yielding consistent image quality, JPEG 2000 encoding with perceptual distortion control makes it possible to do so at bit rates lower than those of JPEG 2000 rate-based distortion-minimization encoding. The figure presents comparative plots of such data, showing that the bit rate for a given level of normalized perceptual distortion is lower for perceptual distortion control.
This work was done by Andrew B. Watson of Ames Research Center and Zhen Liu and Lina J. Karam of Arizona State University.
This invention is owned by NASA, and a patent application has been filed. Inquiries concerning rights for the commercial use of this invention should be addressed to the Ames Technology Partnerships Division at (650) 604-2954.
Refer to ARC-15522-1.