Reducing Drift in Stereo Visual Odometry
- Created: Friday, 28 February 2014
- NASA’s Jet Propulsion Laboratory, Pasadena, California
The drift was reduced from an uncorrected 47 cm to just 7 cm.
Visual odometry (VO) refers to the estimation of vehicle motion using onboard cameras. A common mode of operation uses stereo vision to triangulate a set of image features, track these over time, and infer vehicle motion by computing the apparent point cloud motion with respect to the cameras. It has been observed that stereo VO is subject to drift over time.
It is well known that stereo triangulation suffers from bias induced by correlation error (i.e., subpixel errors in matching features between the left and right images of a stereo pair). This bias is a complicated function of the error statistics of the correlation and depends on scene structure, camera configuration, and imaging geometry. The goal of this work was to characterize the stereo bias better than has been done to date, and to explore the effect of compensating for this bias on VO drift.
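To see why zero-mean matching error can still bias the triangulated range, consider a deliberately simplified case. This is an illustrative assumption, not the model used in this work: a rectified stereo pair with focal length f (pixels) and baseline B, where the measured disparity is the true disparity d plus a zero-mean error ε with variance σ². Because range is inversely proportional to disparity, a second-order expansion gives:

```latex
Z = \frac{fB}{d}, \qquad \hat{d} = d + \varepsilon, \qquad
\mathbb{E}[\varepsilon] = 0, \quad \operatorname{Var}[\varepsilon] = \sigma^2

\hat{Z} = \frac{fB}{d + \varepsilon}
        = \frac{fB}{d}\left(1 - \frac{\varepsilon}{d} + \frac{\varepsilon^2}{d^2} - \cdots\right)

\mathbb{E}[\hat{Z}] - Z \;\approx\; Z\,\frac{\sigma^2}{d^2} \;=\; \frac{Z^3 \sigma^2}{f^2 B^2}
```

Under these assumptions the triangulated range is overestimated on average, and the effect grows rapidly with distance. The full bias described above also depends on scene structure and imaging geometry, which this sketch ignores.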
The problem was approached in terms of a stochastic propagation of error from the image plane to triangulation. The end result is a better representation of stereo bias than has been accomplished thus far. In early tests, this stereo bias correction has had a dramatic effect in reducing VO drift. In a simulated run of 100 m, the drift was reduced from an uncorrected 47 cm to just 7 cm. The preliminary conclusion is that VO drift is due primarily to stereo bias rather than to inherent bias in the non-linear estimator used to recover motion.
The method starts from a model of the noise in correlation matching between the left and right images of a stereo pair. The projection of any point in the world into the stereo pair can be indexed by its pixel location in the left image and the horizontal shift of that location in the right image, referred to as disparity. Thus, every point in the world that is simultaneously visible in both cameras can be described as (x,y,d). Considering only integer values, a lookup table was constructed and indexed by (x,y,d), where (x,y) spans the whole of the left image, and d spans values corresponding to reasonable ranges in the scene. At each (x,y,d), the noise profile assumed for correlation is taken and a Monte Carlo simulation is performed to propagate that noise profile into the triangulated point in space. The results are stored in a table indexed by image column, row, and disparity.
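The table construction described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in this work: it assumes a rectified pinhole stereo pair, Gaussian noise applied to disparity only, and a stored value defined as the true point minus the mean of the noisy triangulations. All function and parameter names (triangulate, mc_correction, build_table, f, cx, cy, baseline, sigma) are hypothetical.

```python
import random


def triangulate(x, y, d, f, cx, cy, baseline):
    """Rectified pinhole stereo: left pixel (x, y) and disparity d to a 3D point."""
    z = f * baseline / d
    return ((x - cx) * z / f, (y - cy) * z / f, z)


def mc_correction(x, y, d, f, cx, cy, baseline, sigma, n_samples, rng):
    """Monte Carlo estimate of (true point - mean noisy point) at one table cell.

    Assumption: matching noise is zero-mean Gaussian on disparity only.
    """
    true_pt = triangulate(x, y, d, f, cx, cy, baseline)
    sums = [0.0, 0.0, 0.0]
    used = 0
    for _ in range(n_samples):
        d_noisy = d + rng.gauss(0.0, sigma)
        if d_noisy <= 0.5:
            # Skip non-physical disparities (point at or behind infinity).
            continue
        p = triangulate(x, y, d_noisy, f, cx, cy, baseline)
        for i in range(3):
            sums[i] += p[i]
        used += 1
    if used == 0:
        raise ValueError("no valid samples; disparity too small for this noise level")
    return tuple(t - s / used for t, s in zip(true_pt, sums))


def build_table(width, height, d_min, d_max, f, cx, cy, baseline,
                sigma, n_samples, seed=0):
    """Build a dict keyed by integer (x, y, d) holding the correction vector."""
    rng = random.Random(seed)
    table = {}
    for x in range(width):
        for y in range(height):
            for d in range(d_min, d_max + 1):
                table[(x, y, d)] = mc_correction(
                    x, y, d, f, cx, cy, baseline, sigma, n_samples, rng)
    return table
```

A full-resolution table in pure Python would be slow to generate. Because the table is built once per calibrated camera system, that cost is paid offline, and a practical version would likely vectorize the sampling.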
Given any point correspondence, the corresponding (x,y,d) is looked up, interpolated from integer to real values as required, and the triangulation statistics are extracted. The lookup table generation is a one-time process for any calibrated imaging system, and the subsequent lookup has minimal computational overhead. From the triangulation statistics, the likely bias is taken and added directly to the triangulated point. In ensemble, this produces a point cloud with higher fidelity than the uncorrected cloud and, consequently, a better incremental VO estimate.
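The lookup step might look like the following sketch. It assumes the table is a dictionary keyed by integer (x, y, d) whose values are 3D correction vectors to be added to the triangulated point, and it uses trilinear interpolation as one plausible way to go from integer to real indices; the interpolation scheme actually used is not specified here. The names interpolate_correction and correct_point are hypothetical.

```python
import math


def interpolate_correction(table, x, y, d):
    """Trilinearly interpolate a correction vector at real-valued (x, y, d).

    `table` maps integer (x, y, d) keys to 3-tuples. Neighbors with zero
    weight are skipped, so exact integer queries need only their own cell.
    """
    x0, y0, d0 = math.floor(x), math.floor(y), math.floor(d)
    fx, fy, fd = x - x0, y - y0, d - d0
    out = [0.0, 0.0, 0.0]
    for ix, wx in ((x0, 1.0 - fx), (x0 + 1, fx)):
        for iy, wy in ((y0, 1.0 - fy), (y0 + 1, fy)):
            for idx, wd in ((d0, 1.0 - fd), (d0 + 1, fd)):
                w = wx * wy * wd
                if w == 0.0:
                    continue
                cell = table[(ix, iy, idx)]
                for i in range(3):
                    out[i] += w * cell[i]
    return tuple(out)


def correct_point(point, table, x, y, d):
    """Add the interpolated correction directly to a triangulated 3D point."""
    c = interpolate_correction(table, x, y, d)
    return tuple(p + ci for p, ci in zip(point, c))
```

Each corrected point costs at most eight dictionary lookups and a handful of multiplications, which is consistent with the minimal overhead noted above.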
To date, VO bias compensation has focused primarily on correcting the motion estimate rather than on correcting the stereo triangulation. This method can be implemented in real time on a deployed system and shows very promising preliminary results.