Predicting Long-Range Traversability From Short-Range Stereo-Derived Geometry
- Monday, 01 November 2010
Learning-based software improves obstacle avoidance by robotic ground vehicles.
This program uses close-range 3D terrain analysis to produce training data sufficient to estimate the traversability of terrain beyond 3D sensing range, based only on that terrain's appearance in imagery. This approach is called learning from stereo (LFS). In effect, the software transfers knowledge from middle distances, where 3D geometry provides training cues, into the far field, where only appearance is available. The approach is viable because the same obstacle classes, and sometimes the same obstacles, are typically present in both the mid-field and the far field. Learning thus extends the effective look-ahead distance of the sensors.
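The training-set selection idea can be sketched as follows. This is a minimal illustration, not the LFS implementation: the `Pixel` record, the `split_by_range` function, and the 0.5 cost threshold are all assumptions introduced here. Pixels that carry a stereo-derived cost become labeled training examples, and pixels beyond stereo range are left for appearance-only classification.

```python
from typing import List, NamedTuple, Optional, Tuple

RGB = Tuple[float, float, float]


class Pixel(NamedTuple):
    """Hypothetical per-pixel record used only for this sketch."""
    rgb: RGB                     # appearance, available at any range
    geom_cost: Optional[float]   # stereo-derived cost; None beyond 3D range


def split_by_range(
    pixels: List[Pixel], cost_threshold: float = 0.5
) -> Tuple[List[Tuple[RGB, int]], List[RGB]]:
    """Split pixels into a labeled training set and an unlabeled far field.

    Label 1 means "obstacle" and 0 means "traversable". The threshold is
    an assumed value, not one taken from the article.
    """
    training: List[Tuple[RGB, int]] = []
    far_field: List[RGB] = []
    for p in pixels:
        if p.geom_cost is None:
            # No 3D geometry here: only appearance can be used.
            far_field.append(p.rgb)
        else:
            label = 1 if p.geom_cost >= cost_threshold else 0
            training.append((p.rgb, label))
    return training, far_field
```

A classifier trained on `training` would then be applied to every entry of `far_field`.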
In the baseline navigation software architecture of both the LAGR (Learning Applied to Ground Robotics) and MTP (Mars Technology Program) programs, stereo image pairs are processed into range imagery, which is then converted to local elevation maps on a ground-plane grid. The grid cells are roughly 20 cm square and cover 5 to 10 m in front of the vehicle, depending on camera height and resolution. The image and the map are the two basic coordinate systems used, but only pixels with nonzero stereo disparity can be placed into the map. Geometry-based traversability analysis heuristics are used to produce local, grid-based, ‘traversability-cost’ maps over the local map area, with a real number representing traversability in each map cell. The local elevation and cost maps are accumulated in a global map as the robot drives. Path planning algorithms for local obstacle avoidance and global route planning are applied to the global map. The resulting path is used to derive steering commands sent to the motor controllers.
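The map-building step described above can be illustrated with a short sketch. It assumes the stereo points are already expressed as (x, y, z) coordinates in meters in a ground-plane frame. The cost heuristic (height spread within a cell divided by an assumed maximum step height) and the names `build_elevation_map`, `cost_map`, and `max_step_m` are hypothetical stand-ins for the geometry-based heuristics, which the article does not specify.

```python
import math
from typing import Dict, Iterable, Tuple

CELL_SIZE_M = 0.20  # roughly 20-cm cells, as stated in the text

Cell = Tuple[int, int]


def build_elevation_map(
    points: Iterable[Tuple[float, float, float]],
    cell_size: float = CELL_SIZE_M,
) -> Dict[Cell, Tuple[float, float]]:
    """Bin 3D points into grid cells, keeping (min z, max z) per cell."""
    cells: Dict[Cell, Tuple[float, float]] = {}
    for x, y, z in points:
        key = (math.floor(x / cell_size), math.floor(y / cell_size))
        if key in cells:
            lo, hi = cells[key]
            cells[key] = (min(lo, z), max(hi, z))
        else:
            cells[key] = (z, z)
    return cells


def cost_map(
    cells: Dict[Cell, Tuple[float, float]], max_step_m: float = 0.30
) -> Dict[Cell, float]:
    """Assign each cell a real-valued cost in [0, 1].

    Assumed heuristic: the height spread inside a cell, scaled by an
    assumed maximum climbable step and clipped at 1.0.
    """
    return {
        key: min(1.0, (hi - lo) / max_step_m)
        for key, (lo, hi) in cells.items()
    }
```

A fuller heuristic would also compare neighboring cells (slope, step height between cells); this sketch looks only within each cell to keep the example short.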
The software (training set selection, classifier training, and image classification) runs in real time at about 3 Hz on a 2-GHz processor, and the type of “image appearance features” is user-configurable. Basic RGB (red-green-blue) features, their powers, separable textures, and within-patch color histograms can be used in any combination, and all of these run in real time. The software can work in two modes: purely online, or with a fixed, previously learned classifier. To learn such a classifier, a built-in cumulative-training mode accumulates training data across an entire run, learns a model at the end of the run, and saves the model to a reusable configuration file. The cumulative-training mode can run alongside the online classification mode. Either of two classifiers can be used: a linear discriminant analysis (LDA)-based method or a linear support vector machine (SVM).
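To make the feature and classifier choices concrete, here is a minimal sketch of RGB-and-powers features fed to a two-class Fisher linear discriminant. It is an assumption-laden illustration rather than the flight or field code: the function names, the ridge term added to keep the scatter matrix invertible, and the midpoint decision threshold are all choices made here, and the SVM alternative is not shown.

```python
from typing import List, Sequence, Tuple


def features(rgb: Sequence[float], max_power: int = 2) -> List[float]:
    """RGB values followed by their powers up to max_power."""
    return [c ** p for p in range(1, max_power + 1) for c in rgb]


def mean(rows: List[List[float]]) -> List[float]:
    return [sum(col) / len(rows) for col in zip(*rows)]


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def train_lda(
    samples: List[Tuple[List[float], int]], ridge: float = 1e-3
) -> Tuple[List[float], float]:
    """Fit a two-class linear discriminant; labels must be 0 or 1."""
    x0 = [f for f, y in samples if y == 0]
    x1 = [f for f, y in samples if y == 1]
    mu0, mu1 = mean(x0), mean(x1)
    d = len(mu0)
    # Within-class scatter, with an assumed ridge on the diagonal.
    sw = [[ridge if i == j else 0.0 for j in range(d)] for i in range(d)]
    for rows, mu in ((x0, mu0), (x1, mu1)):
        for f in rows:
            diff = [fi - mi for fi, mi in zip(f, mu)]
            for i in range(d):
                for j in range(d):
                    sw[i][j] += diff[i] * diff[j]
    w = solve(sw, [a - b for a, b in zip(mu1, mu0)])
    # Decision boundary passes through the midpoint of the class means.
    mid = [(a + b) / 2 for a, b in zip(mu0, mu1)]
    bias = -sum(wi * mi for wi, mi in zip(w, mid))
    return w, bias


def classify(w: List[float], bias: float, f: Sequence[float]) -> int:
    """Return 1 (obstacle) if the linear score is positive, else 0."""
    return 1 if sum(wi * fi for wi, fi in zip(w, f)) + bias > 0 else 0
```

In use, `train_lda` would be called on feature vectors from pixels labeled by the geometry-based cost map, and `classify` would then be applied to feature vectors from far-field pixels. Saving `w` and `bias` to a file would correspond loosely to the reusable configuration file mentioned above.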
This work was done by Michael Turmon, Benyang Tang, Andrew Howard, and Max Bajracharya of Caltech for NASA’s Jet Propulsion Laboratory.