Machine Vision Fundamentals: How to Make Robots ‘See’
- Created: Wednesday, 01 June 2011
- Template matching: A template-matching tool is shown and trained on one or more images of the item of interest, like the round clips of the assembly in Figures 1 and 2. It may learn the entire image of the part, or certain features such as the geometry of the edges. During operation, the technology searches the field of view for a near match to what it “learned.” Various image and mathematical processing methods (such as normalized correlation) can accomplish this. Those based on edge geometries offer advantages for partially occluded objects, or a scale-invariance option when the camera’s distance from the object is variable. When the degree of match exceeds a minimum threshold, the object is “kept.” Figure 2 shows these results, where the software tool has found two clips that met the match criteria and marked them with yellow rectangles.
- Differentiation based on brightness: This method involves determining a brightness “threshold” on a grayscale image such that everything above or below that value is the object of interest (i.e., light objects on a dark background or vice versa). Most commonly this is a value between 0 and 255, corresponding to the 256 levels available in 8-bit coding for each pixel. The threshold value may be fixed, or it may adapt to varying light levels via a simple (average gray level) or complex (histogram-based) algorithm. The threshold is applied to the image, separating the object(s) of interest from the background.
- Differentiation based on color: Color is best addressed by transforming each pixel of the image to its “distance” from the trained color sample set in three-axis color space. Color representation methods usually characterize a color by three coefficients. RGB (Red, Green, Blue) is common and native to most imaging and display processes. Plotting such triplets requires a three-dimensional graph called a “color space”; for RGB, the R, G, and B axes are mutually orthogonal. The “distance” between the points representing two colors in this space is the three-dimensional Pythagorean (Euclidean) distance between them, $\sqrt{(R_1 - R_2)^2 + (G_1 - G_2)^2 + (B_1 - B_2)^2}$. The “trained color” can be that of either the desired object or the background.
Figure 3 shows the original image before a color tool is trained on the shades of red present in the screwdriver handle. Executing the tool on the color image transforms it into the synthetic image shown in Figure 4. The shade of each pixel represents the distance in 3D color space (the closeness of the match) to the trained color. Finally, in Figure 5, the handle is isolated as a distinct object by thresholding and marked green on the display.
- Differentiation based on height: This technique is used on images where the third dimension is scanned and coded into the pixel values as previously described. This synthetic image may then be processed in the same manner, for example by thresholding on the pixel values.
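The color-distance transform and the thresholding step described in the list above can be sketched in a few lines of plain Python. This is a minimal illustration, not the method of any particular vision package: the function names, the tiny sample image, the trained color, and the distance limit of 30 are all assumptions chosen for the example.

```python
import math

def color_distance(pixel, trained):
    """Euclidean distance between two (R, G, B) triplets."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pixel, trained)))

def distance_image(image, trained):
    """Map every RGB pixel to its distance from the trained color."""
    return [[color_distance(px, trained) for px in row] for row in image]

def threshold_below(gray, limit):
    """Binary mask: True where the pixel value is at or below the limit."""
    return [[value <= limit for value in row] for row in gray]

# Hypothetical 2x3 image: reddish "handle" pixels on a gray background.
image = [
    [(200, 30, 40), (128, 128, 128), (210, 25, 35)],
    [(120, 125, 130), (205, 20, 45), (127, 127, 127)],
]
trained_red = (205, 25, 40)  # assumed trained color sample

# Small distance means a close match, so keep the pixels below the limit.
mask = threshold_below(distance_image(image, trained_red), limit=30.0)
for row in mask:
    print(row)
```

The same `threshold_below` helper (or its "above" counterpart) applies unchanged to a grayscale brightness image or to a height-coded image, since each is a single value per pixel.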
All methods may retain multiple eligible objects, in which case a choice must be made between them based on an additional criterion, such as “first in line.”
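As a rough sketch of how a match threshold can yield several candidates that must then be narrowed to one, the following plain-Python example scores every window of a small grayscale image against a template using zero-mean normalized correlation. The helper names, the sample data, the 0.9 minimum score, and the interpretation of “first in line” as “smallest x coordinate” are assumptions made for illustration only.

```python
import math

def ncc(window, template):
    """Zero-mean normalized correlation of two equal-length pixel lists."""
    mean_w = sum(window) / len(window)
    mean_t = sum(template) / len(template)
    dw = [w - mean_w for w in window]
    dt = [t - mean_t for t in template]
    denom = math.sqrt(sum(a * a for a in dw) * sum(b * b for b in dt))
    if denom == 0:
        return 0.0  # flat window or template: no meaningful match
    return sum(a * b for a, b in zip(dw, dt)) / denom

def find_matches(image, template, min_score):
    """Return (x, y, score) for every window scoring at least min_score."""
    th, tw = len(template), len(template[0])
    flat_t = [v for row in template for v in row]
    matches = []
    for y in range(len(image) - th + 1):
        for x in range(len(image[0]) - tw + 1):
            window = [image[y + j][x + i] for j in range(th) for i in range(tw)]
            score = ncc(window, flat_t)
            if score >= min_score:
                matches.append((x, y, score))
    return matches

def first_in_line(matches):
    """Assumed tie-break rule: keep the candidate with the smallest x."""
    return min(matches, key=lambda m: m[0]) if matches else None

# Hypothetical data: the 2x2 pattern appears twice in the image.
template = [[0, 9],
            [9, 0]]
image = [[0, 9, 1, 1, 0, 9],
         [9, 0, 1, 1, 9, 0]]

matches = find_matches(image, template, min_score=0.9)
chosen = first_in_line(matches)
print(matches, chosen)
```

An exhaustive scan like this is only practical for tiny images; it is shown to make the scoring and selection logic explicit, not as an efficient search strategy.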
#3 Determine the position and orientation of the object: For our example, the results of this stage are the x and y coordinates of the object and the angle of its orientation. Sometimes this function is performed as part of the previous “find” procedure. For example, a template-match tool might supply position and orientation data on the part it has located. Otherwise, adding simple software tools that provide feature computation or geometric analysis will generally complete this task.
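One common form of geometric analysis, shown here as a sketch rather than as the tool the article has in mind, derives position from the centroid of the object's pixels and orientation from their second-order central moments. The function name and the sample mask are hypothetical, and the angle is reported in image coordinates (y increasing downward).

```python
import math

def position_and_angle(mask):
    """Centroid (x, y) and principal-axis angle in degrees of a binary mask."""
    points = [(x, y) for y, row in enumerate(mask)
              for x, on in enumerate(row) if on]
    if not points:
        raise ValueError("mask contains no object pixels")
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    # Second-order central moments describe the spread about the centroid.
    mu20 = sum((x - cx) ** 2 for x, _ in points) / n
    mu02 = sum((y - cy) ** 2 for _, y in points) / n
    mu11 = sum((x - cx) * (y - cy) for x, y in points) / n
    # Angle of the principal axis, measured from +x toward +y.
    angle = 0.5 * math.atan2(2 * mu11, mu20 - mu02)
    return cx, cy, math.degrees(angle)

# Hypothetical mask: a thin diagonal object.
mask = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]
cx, cy, angle_deg = position_and_angle(mask)
print(cx, cy, angle_deg)
```

The principal-axis angle is ambiguous by 180 degrees, so a part whose two ends differ would need an additional feature check to resolve its direction.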
#4 Translate the information to the coordinate system of the robot: The vision system and the robot each innately have their own coordinate system to represent location, each with orthogonal “x” and “y” axes. To communicate with the robot, coordinates must be translated from one system to the other; this is usually handled by the vision system.
Besides permanent innate differences, other small errors may be introduced. Simply adding or subtracting a correction factor to or from the x and y values can provide first-order correction and translation for these factors. A tool designed for this purpose operates in two modes: a “learn/calibrate” mode (in which the robot may be stopped in view of the camera with a target on it) and a run mode, in which the correction or translation is applied. The x and y offsets between the systems are set during a calibration sequence and applied to the measurements during running.
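The two-mode offset tool described above can be sketched as follows. This is an illustrative minimum that assumes both coordinate systems share the same units, scale, and axis directions, so that a pure x and y offset is sufficient; the class name, method names, and sample coordinates are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class OffsetCalibration:
    """First-order (offset-only) translation from vision to robot coordinates."""
    dx: float = 0.0
    dy: float = 0.0

    def learn(self, vision_xy, robot_xy):
        """Learn/calibrate mode: a target at a known robot position is measured
        by the camera, and the difference becomes the stored offset."""
        self.dx = robot_xy[0] - vision_xy[0]
        self.dy = robot_xy[1] - vision_xy[1]

    def to_robot(self, vision_xy):
        """Run mode: apply the stored offset to a vision measurement."""
        return (vision_xy[0] + self.dx, vision_xy[1] + self.dy)

# Calibration sequence with assumed sample values.
cal = OffsetCalibration()
cal.learn(vision_xy=(120.0, 80.0), robot_xy=(320.0, 55.0))

# During running, each located object is translated before being sent on.
pick = cal.to_robot((150.0, 90.0))
print(pick)
```

If the camera and robot axes were rotated or scaled relative to each other, an offset alone would not be enough and a fuller transform would have to be calibrated instead.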