A system of electronic hardware and software, now undergoing development, automatically estimates the location of a robotic land vehicle in an urban environment using a somewhat imprecise map, which has been generated in advance from aerial imagery. This system does not utilize the Global Positioning System and does not include any odometry, inertial measurement units, or any other sensors except a stereoscopic pair of black-and-white digital video cameras mounted on the vehicle. Of course, the system also includes a computer running software that processes the video image data.
The software consists mainly of three components, corresponding to the three major image-data-processing functions:
Visual Odometry
This component automatically tracks point features in the imagery and computes the relative motion of the cameras between sequential image frames. This component incorporates a modified version of a visual-odometry algorithm originally published in 1989. The algorithm selects point features, performs multiresolution area-correlation computations to match the features in stereoscopic images, tracks the features through the sequence of images, and uses the tracking results to estimate the six-degree-of-freedom motion of the camera between consecutive stereoscopic pairs of images (see figure).
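The final step of this component, estimating six-degree-of-freedom camera motion from tracked stereo features, can be sketched as a least-squares rigid alignment of the 3D feature positions triangulated in two consecutive stereo pairs. This is a minimal illustration under standard assumptions (the function name is ours, and the actual algorithm also performs feature selection, multiresolution area-correlation matching, and outlier handling not shown here):

```python
import numpy as np

def relative_motion(pts_prev, pts_curr):
    """Estimate the rigid transform (R, t) that maps 3D feature points
    triangulated in the previous stereo frame onto their tracked
    positions in the current frame, by SVD-based least squares.
    pts_prev, pts_curr: (N, 3) arrays of corresponding 3D points."""
    c_prev = pts_prev.mean(axis=0)
    c_curr = pts_curr.mean(axis=0)
    # Cross-covariance of the centered point sets.
    H = (pts_prev - c_prev).T @ (pts_curr - c_curr)
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection solution.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = c_curr - R @ c_prev
    return R, t
```

Chaining the per-frame transforms then yields the camera trajectory over the image sequence.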
Urban Feature Detection and Ranging
Using the same data as those processed by the visual-odometry component, this component strives to determine the three-dimensional (3D) coordinates of vertical and horizontal lines that are likely to be parts of, or close to, the exterior surfaces of buildings. The basic sequence of processes performed by this component is the following:
1. An edge-detection algorithm is applied, yielding a set of linked lists of edge pixels, a horizontal-gradient image, and a vertical-gradient image.
2. Straight-line segments of edges are extracted from the linked lists generated in step 1. Any straight-line segment longer than an arbitrary threshold (e.g., 30 pixels) is treated as a candidate edge of a building or other artificial object.
3. A gradient-filter algorithm tests the candidate segments to determine whether they represent edges of natural or artificial objects. In somewhat oversimplified terms, the test rests on the assumption that the gradient of image intensity varies little along a segment that represents the edge of an artificial object.
4. A roof-line-detection algorithm identifies, as candidate roof lines, line segments (a) that exceed a threshold length and (b) above which there are no other such lines.
5. The 3D positions of the line segments detected in the preceding steps are computed from either (a) ordinary stereoscopic imagery acquired simultaneously by the two cameras or (b) wide-baseline stereoscopic imagery synthesized from imagery acquired in two successive frames, using relative camera positions determined by visual odometry. The choice between (a) and (b) rests on which of the two, given the viewing geometry, is expected to yield the more accurate triangulation.
6. A heuristic pruning algorithm filters the remaining line segments: lines that are not approximately vertical or horizontal are discarded, horizontal lines longer than 2 m and vertical lines that extend above 2 m are retained, and sets of parallel lines are selected.
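The gradient-filter test of step 3 can be illustrated with a minimal sketch. The function name, the coefficient-of-variation formulation, and the 0.25 threshold are our assumptions for illustration, not details taken from the actual software:

```python
import numpy as np

def is_artificial_edge(seg_pixels, gx, gy, max_cv=0.25):
    """Accept a straight-line segment as an artificial-object edge if
    the image-intensity gradient varies little along it, here measured
    by the coefficient of variation of the gradient magnitude.
    seg_pixels: (N, 2) array of (row, col) pixel indices along the segment.
    gx, gy: horizontal- and vertical-gradient images (from step 1).
    max_cv: illustrative uniformity threshold."""
    r, c = seg_pixels[:, 0], seg_pixels[:, 1]
    mag = np.hypot(gx[r, c], gy[r, c])
    cv = mag.std() / (mag.mean() + 1e-9)   # avoid division by zero
    return cv < max_cv
```

A segment along a painted building edge, with nearly constant contrast, passes this test; a segment traced along foliage, where contrast fluctuates pixel to pixel, fails it.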
Particle-Filter-Based Localization

The outputs of the visual-odometry and urban-feature-detection-and-ranging components are fed to this component, which implements a particle-filter-based localization algorithm. The key notion in particle-filter-based robot localization is that a particle filter produces an approximate probability density function for the position and heading of a robot by use of Monte Carlo techniques, making it possible to incorporate knowledge of measurement uncertainty in a rigorous manner. Because the open-source particle-filter-based localization software (developed by Prof. Sebastian Thrun of Stanford University) used in a prototype of the system is based on a planar model, the input data fed to this component are preprocessed into simulated single-axis range data (in effect, simulated LIDAR range data) by projecting all 3D features onto a horizontal plane and sampling the field of view at small angular intervals (1°).
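The planar preprocessing described above can be sketched as follows. The function name and parameters are illustrative, not taken from the flight software; the sketch assumes the 3D wall features have already been projected onto the horizontal plane as 2D line segments in the robot frame:

```python
import numpy as np

def lines_to_range_scan(segments_xy, n_beams=360):
    """Convert projected 2D wall segments into a simulated LIDAR-like
    scan: one range value per 1-degree beam, taken as the distance to
    the nearest intersected segment (inf if no segment is hit).
    segments_xy: iterable of ((x0, y0), (x1, y1)) endpoint pairs."""
    angles = np.deg2rad(np.arange(n_beams) * 360.0 / n_beams)
    scan = np.full(n_beams, np.inf)
    for p0, p1 in segments_xy:
        p0 = np.asarray(p0, float)
        e = np.asarray(p1, float) - p0          # segment direction
        for i, a in enumerate(angles):
            d = np.array([np.cos(a), np.sin(a)])  # beam direction
            # Solve r*d - s*e = p0 for ray length r, segment parameter s.
            det = -d[0] * e[1] + e[0] * d[1]
            if abs(det) < 1e-12:                # beam parallel to segment
                continue
            r = (-p0[0] * e[1] + e[0] * p0[1]) / det
            s = (d[0] * p0[1] - d[1] * p0[0]) / det
            if r >= 0.0 and 0.0 <= s <= 1.0:
                scan[i] = min(scan[i], r)
    return scan
```

The resulting per-degree range array plays the role of a LIDAR measurement in the planar particle filter's sensor-update step.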
Notwithstanding the oversimplification inherent in this approach, success in localization has been achieved in initial experiments.
This work was done by Michael McHenry, Yang Cheng, and Larry Matthies of Caltech for NASA’s Jet Propulsion Laboratory.
In accordance with Public Law 96-517, the contractor has elected to retain title to this invention. Inquiries concerning rights for its commercial use should be addressed to:
Innovative Technology Assets Management
Mail Stop 202-233
4800 Oak Grove Drive
Pasadena, CA 91109-8099
Refer to NPO-41881, volume and number of this NASA Tech Briefs issue, and the page number.