In this work, four independent vision modules were implemented in an FPGA: a CameraLink camera interface, rectification, bilateral filtering, and stereo disparity correlation. Because each module was originally designed to run end to end rather than in a pipeline with the other modules, throughput was limited to 3.75 Hz.
The specifications call for 15 frames per second at 1024 × 768 resolution, using stereo image pairs to produce stereo disparity maps. A system was built in an FPGA that handles all four of the steps necessary to compute stereo disparity. Previous work solved the problems of rectification, correlation, and filtering individually; this work focuses on combining those systems into a single pipeline on a single FPGA. Images are captured at 15 Hz through the CameraLink interface, then rectified, filtered, and correlated without interaction from the CPU. When a frame is complete, an interrupt is sent over the PCI bus to the CPU so the output disparity image can be read over the same bus.
The CameraLink library provided by the board vendor was used; customizing it allowed input images to stream directly into the next module (rectification) as the pixels were received. Rectification de-warps the input images so that correlation can be performed on them, producing the warped images. Once rectification has finished, a bilateral filter operates on both warped images simultaneously, and the correlation function then produces the stereo disparity data.
The primary improvements to each step were in the interfaces between the modules. The control logic that drives each module was largely rewritten to take each module from standalone operation to running in series as an end-to-end pipeline. At the software level, only the left warped image and the disparity image were needed, not both warped images and the disparity image. To achieve the desired throughput, the left warped image is read out while it is being fed into the bilateral filter; because the bilateral filtering and the image transfer occur simultaneously, 10 ms is saved per frame. The bilateral filter operates in a slave mode, meaning it is fed data rather than fetching data. This allows the PCI bus to read the left warped image out of SRAM while the same stream feeds the bilateral filter.
The stereo correlator was configured to read data streaming from the bilateral filter instead of reading it from SRAM. C++ code was written to handle the timing and control of the system: a device driver layer that starts the FPGA and reads the left warped image and the disparity image into main memory for obstacle detection. Five independent SRAM banks on the FPGA board are used to handle the parallel processing of data.
The visual algorithms needed to compute stereo disparity on large imagery can now run in an embedded system that acts as a preprocessor before the CPU ever sees the imagery. This is useful for satellites as well as for descent imagery that must be processed very quickly, such as during entry, descent, and landing. It is also applicable to future rover missions, where navigation imagery can be preprocessed to unload the processor for other tasks. This will be used to great effect soon, as this work has been repackaged as part of the “Fast Traverse” system, a flagship baseline functionality of the Mars 2020 rover.