Algorithms have been developed to enable a robotic vision system to recognize, in real time (at a rate between 0.5 and 2 frames per second), known objects lying on the ground. In the original intended application, the algorithms would be executed by off-the-shelf computer hardware aboard a robotic vehicle that would traverse military ordnance-testing ranges to search for unexploded bombs. A stereoscopic pair of color video cameras aboard the vehicle would acquire images of the terrain near the vehicle, and the algorithms would process the digitized images to recognize the bombs by their known size, shape, and color. The algorithms may also be adaptable to other, similar robotic-vision applications — for example, automated recognition of color traffic signs for alerting drivers in automobiles.

Candidate objects are identified in an image, then resampled to a canonical size and orientation for further processing.

The methodology implemented by the algorithms can be summarized as follows: First, raw data from a pair of color stereoscopic images are subjected to rapid preliminary processing to detect candidate locations (that is, locations to be examined more thoroughly for the presence of bombs). Once the candidates have been detected, additional computations are performed to reduce false alarms, reason about the remaining available image data, and make a final decision about each candidate.

The preliminary processing includes several steps that result in the generation of range data from the disparity between the left and right images of the stereoscopic pair. The stereoscopic range data are used initially, along with other abstracted data, to place bounds on the sizes of objects in the scene; this makes it possible to eliminate, from further consideration, all parts of the scene that do not contain candidate objects within the size range of the objects of interest (the bombs in the original application). This elimination reduces the search space and reduces the incidence of false alarms.
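The size-based elimination step can be illustrated with a minimal sketch. It assumes a rectified pinhole stereo pair, for which range is Z = f * B / d (focal length f in pixels, baseline B in meters, disparity d in pixels); the brief does not state the camera model or how the "other abstracted data" enter the computation. All function names, parameters, and numeric values here are hypothetical.

```python
def range_from_disparity(disparity_px, focal_px, baseline_m):
    """Range in meters for a rectified pinhole stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def physical_extent(extent_px, range_m, focal_px):
    """Approximate physical size (m) of a feature spanning extent_px pixels."""
    return extent_px * range_m / focal_px


def passes_size_gate(extent_px, disparity_px, focal_px, baseline_m,
                     min_size_m, max_size_m):
    """True if the feature's estimated size lies within the size range of interest."""
    z = range_from_disparity(disparity_px, focal_px, baseline_m)
    size = physical_extent(extent_px, z, focal_px)
    return min_size_m <= size <= max_size_m


# Hypothetical camera: 800-pixel focal length, 0.25 m baseline.
# An 80-pixel-long region at 40 pixels of disparity is kept or rejected
# depending on the assumed size range of 0.3 m to 1.0 m.
keep = passes_size_gate(80, 40, 800.0, 0.25, 0.3, 1.0)
```

Regions that fail such a gate would be dropped before any color processing, which is how the search space shrinks in the scheme described above.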

Next, the color of each pixel in the remaining search space is quantified by computing a unit vector in a three-axis color space from its red, green, and blue brightnesses. Each pixel is then classified as either like or unlike an object of interest, depending on whether its unit color vector does or does not lie within that volume in the color space that represents the range of anticipated variation of color of the object of interest, given anticipated variations in lighting, viewing angles, and natural discoloration from weathering. Candidates are identified by locating blocks of contiguous pixels that have been so classified.
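The color classification and grouping just described can be sketched as follows. The sketch approximates the acceptance volume in color space by a cone around a single reference unit color vector and treats "contiguous" as 4-connected; both are assumptions, since the brief specifies neither the shape of the volume nor the connectivity rule. All names and thresholds are hypothetical.

```python
import math
from collections import deque


def unit_color(rgb):
    """Unit vector in RGB space, or None for a black pixel (direction undefined)."""
    r, g, b = rgb
    norm = math.sqrt(r * r + g * g + b * b)
    if norm == 0:
        return None
    return (r / norm, g / norm, b / norm)


def is_object_like(rgb, reference_unit, min_cosine):
    """True if the pixel's unit color lies within the assumed cone around the reference."""
    u = unit_color(rgb)
    if u is None:
        return False
    return sum(a * b for a, b in zip(u, reference_unit)) >= min_cosine


def candidate_blocks(image, reference_unit, min_cosine, min_pixels):
    """Return 4-connected blocks of object-like pixels as lists of (row, col)."""
    rows, cols = len(image), len(image[0])
    mask = [[is_object_like(image[r][c], reference_unit, min_cosine)
             for c in range(cols)] for r in range(rows)]
    seen = [[False] * cols for _ in range(rows)]
    blocks = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited object-like pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            block = []
            while queue:
                y, x = queue.popleft()
                block.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(block) >= min_pixels:
                blocks.append(block)
    return blocks
```

Because only the direction of the color vector is compared, a pixel and a dimmer pixel of the same hue classify identically, which is one reason a unit-vector representation tolerates changes in illumination level.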

After candidate locations have been detected, a variety of verification software modules can be applied to reduce false alarms. Although verification is more computation-intensive than are the preceding steps, the verification process does not greatly increase the overall computation time because much of the image has been eliminated from consideration in the preceding steps.

The first step in the verification process is to compute the dominant orientation of the object at each candidate location, then use the resulting information, along with the range data, to resample the candidate object at a canonical scale and orientation (see figure). Each resampled candidate object is then subjected to a series of tests that rate the spatial distribution of color, the likelihood that edges consistent with those of the objects of interest are present, the height of the candidate object, and the contrast between the object and the background. Weighted sums of the quantitative results of these tests are used to compute the probability that the candidate object is one of the objects of interest; the candidate is deemed to be an object of interest if the computed probability exceeds a predetermined threshold value.
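Two parts of the verification process lend themselves to a short sketch: estimating the dominant orientation of a candidate and combining test results into a decision. The brief does not say how either is computed. The sketch below swaps in two common techniques as assumptions: orientation from second-order central moments of the candidate's pixels, and a logistic function applied to the weighted sum to obtain a probability. The test names, weights, bias, and threshold are hypothetical.

```python
import math


def dominant_orientation(pixels):
    """Principal-axis angle in radians, in (-pi/2, pi/2], for (row, col) pixels.

    The angle is measured from the +column axis toward the +row axis
    (rows increase downward in image coordinates).
    """
    n = len(pixels)
    mean_y = sum(p[0] for p in pixels) / n
    mean_x = sum(p[1] for p in pixels) / n
    mu20 = sum((p[1] - mean_x) ** 2 for p in pixels) / n
    mu02 = sum((p[0] - mean_y) ** 2 for p in pixels) / n
    mu11 = sum((p[1] - mean_x) * (p[0] - mean_y) for p in pixels) / n
    return 0.5 * math.atan2(2.0 * mu11, mu20 - mu02)


def object_probability(scores, weights, bias):
    """Logistic function of a weighted sum of test scores (assumed form)."""
    total = bias + sum(weights[name] * scores[name] for name in weights)
    return 1.0 / (1.0 + math.exp(-total))


def is_object_of_interest(scores, weights, bias, threshold):
    """Accept the candidate only if the probability exceeds the threshold."""
    return object_probability(scores, weights, bias) > threshold


# Hypothetical weights for the four tests named in the text.
WEIGHTS = {"color_layout": 2.0, "edges": 1.5, "height": 1.0, "contrast": 0.5}
strong = {"color_layout": 1.0, "edges": 1.0, "height": 1.0, "contrast": 1.0}
accepted = is_object_of_interest(strong, WEIGHTS, bias=-2.5, threshold=0.5)
```

The orientation angle, together with the stereo range, would supply the rotation and scale needed to resample a candidate to the canonical frame before the tests are scored.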

The algorithms were tested on 350 images acquired at a live-fire test range near Nellis Air Force Base. (Training for the candidate-detection stages was performed on a set of images collected at the same site one year earlier.) Overall, 324 instances of 75 different bombs of the same type appear in the test set. Each bomb was detected in at least one of the images that showed it. Several false negatives occurred in instances in which the bombs lay at significant distances from the cameras and thus yielded small images. In these cases, the bombs were always detected when the cameras traveled closer to them. In addition, 19 false alarms occurred. Of the candidate objects reported, 92.6 percent were found to be bombs.

This work was done by Clark F. Olson of Caltech for NASA's Jet Propulsion Laboratory. For further information, access the Technical Support Package (TSP), available free online under the Information Sciences category.


This Brief includes a Technical Support Package (TSP).
Algorithms for Recognition of Objects in Color Stereo Images (reference NPO-20754) is currently available for download from the TSP library.
