A bilateral subtraction filter has been implemented as a hardware module in the form of a field-programmable gate array (FPGA). In general, a bilateral subtraction filter is a key subsystem of a high-quality stereoscopic machine vision system that utilizes images that are large and/or dense. Bilateral subtraction filters have been implemented in software on general-purpose computers, but the processing speeds attainable in this way — even on computers containing the fastest processors — are insufficient for real-time applications. The present FPGA bilateral subtraction filter is intended to accelerate processing to real-time speed and to be a prototype of a link in a stereoscopic-machine-vision processing chain, now under development, that would process large and/or dense images in real time and would be implemented in an FPGA.

A Bilateral Subtraction Filter is implemented in an FPGA that performs parallel pipeline computations in a moving 9×9-pixel window.
In terms that are necessarily oversimplified for the sake of brevity, a bilateral subtraction filter is a smoothing, edge-preserving filter for suppressing low-frequency noise. The filter operation amounts to replacing the value for each pixel with a weighted average of the values of that pixel and the neighboring pixels in a predefined neighborhood or window (e.g., a 9×9 window). The filter weights depend partly on pixel values and partly on the window size.

The present FPGA implementation of a bilateral subtraction filter utilizes a 9×9 window. This implementation was designed to take advantage of the ability to do many of the component computations in parallel pipelines to enable processing of image data at the rate at which they are generated. The filter can be considered to be divided into the following parts (see figure):

  • An image pixel pipeline with a 9×9-pixel window generator,
  • An array of processing elements;
  • An adder tree;
  • A smoothing-and-delaying unit; and
  • A subtraction unit.

After each 9×9 window is created, the affected pixel data are fed to the processing elements. Each processing element is fed the pixel value for its position in the window as well as the pixel value for the central pixel of the window. The absolute difference between these two pixel values is calculated and used as an address in a lookup table. Each processing element has a lookup table, unique for its position in the window, containing the weight coefficients for the Gaussian function for that position. The pixel value is multiplied by the weight, and the outputs of the processing element are the weight and pixel-value·weight product. The products and weights are fed to the adder tree. The sum of the products and the sum of the weights are fed to the divider, which computes the sum of products ÷ the sum of weights. The output of the divider is denoted the bilateral smoothed image.

The smoothing function is a simple weighted average computed over a 3×3 subwindow centered in the 9×9 window. After smoothing, the image is delayed by an additional amount of time needed to match the processing time for computing the bilateral smoothed image. The bilateral smoothed image is then subtracted from the 3×3 smoothed image to produce the final output.

The prototype filter as implemented in a commercially available FPGA processes one pixel per clock cycle. Operation at a clock speed of 66 MHz has been demonstrated, and results of a static timing analysis have been interpreted as suggesting that the clock speed could be increased to as much as 100 MHz.

This work was done by Andres Huertas, Robert Watson, and Carlos Villalpando of Caltech and Steven Goldberg of Indelible Systems for NASA’s Jet Propulsion Laboratory. NPO-45906