Three-dimensional (3D) imaging applications are used in many different industries, ranging from industrial pick and place, palletization/depalletization, warehousing, robotics, and metrology to consumer-facing products such as drones, safety and security, and patient monitoring. No single 3D technology can address all of these applications; the features of each must be weighed against the requirements of the application at hand.

3D imaging systems can be classified as passive or active. In passive systems, ambient or broad fixed illumination is trained on the object. Active systems use various methods to spatially or temporally modulate light, including laser line scanning, speckle projection, fringe pattern projection, or time-of-flight (ToF) measurement. In both passive and active systems, light reflected from the illuminated object is captured, often by a CMOS-based camera, to generate a depth map, and then a 3D model, of the object.

Passive Stereo

Depth map of different-sized cardboard boxes. Distance to camera is 1.5 m.

The standard passive stereo vision system uses two cameras located a fixed distance apart (known as the baseline) to capture a scene from two different positions. Using triangulation, depth information is extracted by matching features in both images. The closer an object is, the more its features appear shifted laterally between the two images. The magnitude of this shift, called the disparity, is used to calculate the distance of the matched features.
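The disparity-to-depth relation can be sketched in a few lines. The focal length, baseline, and disparity values below are illustrative, not taken from any particular camera:

```python
# Stereo depth from disparity: Z = f * B / d
# f: focal length in pixels, B: baseline in meters, d: disparity in pixels.
# All values here are illustrative.

def depth_from_disparity(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth of a matched feature from its pixel disparity."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_px * baseline_m / disparity_px

# A feature with 40 px disparity, seen by a pair with an 800 px focal
# length and a 10 cm baseline, lies about 2 m from the cameras:
print(depth_from_disparity(800, 0.10, 40))  # ≈ 2.0
```

Note that depth is inversely proportional to disparity, which is why stereo accuracy degrades rapidly at long range: distant objects produce sub-pixel disparities that are hard to match reliably.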

A 3D triangulation-based system can also be implemented using a single camera. Instead of using pairs of fixed stereo cameras, a single camera mounted on a robot can be moved to different positions around an object. By mapping multiple images using feature-extraction algorithms, 3D images can be re-created using specialized calibration techniques. The system accuracy will be limited to the positional accuracy of the robot.

Structured Light

Sony's DepthSense ToF IMX556PLR sensor.

Structured light systems project patterned light onto the objects. Instead of searching for features that may be hard to see or even non-existent, the cameras need only locate the well-defined patterns of light created by a light projector. Structured light systems typically fall into one of two broad categories: active stereo systems or calibrated projector systems.

Active stereo systems operate in a manner very similar to passive stereo systems with the exception that artificial texture is projected onto the objects. The projected texture can be created by various means such as conventional reticule projectors with LED backlight, lasers with diffractive optic patterns, or laser speckle generated from diffusers. The projected pattern is captured by a stereo camera system and feature-matching is obtained by digital image correlation in the same way as with passive stereo. Measurement accuracy will be limited by the inherent accuracy of the stereo camera in addition to the resolution of the projected pattern.

A second category of structured light systems makes use of calibrated projected patterns. Instead of just projecting texture for stereo correlation, the calibrated projector's pattern is accurately known and forms an integral part of the 3D measurement. Only a single camera is needed to compute depth from triangulation since the projector forms a known vertex on the triangle, analogous to the function of the second camera in a stereo camera pair.
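Under the usual pinhole assumptions, the camera-projector pair forms a triangle whose depth follows from the baseline and the two ray angles, Z = B / (cot α + cot β). The geometry below is made up purely to illustrate the relation:

```python
import math

# Triangulation with a calibrated projector: the projector replaces the
# second camera of a stereo pair. With baseline B between camera and
# projector, and ray angles alpha (camera) and beta (projector) measured
# from the baseline, the depth is Z = B / (cot(alpha) + cot(beta)).
# The numbers below are illustrative, not from a real device.

def depth_from_angles(baseline_m: float, alpha_rad: float, beta_rad: float) -> float:
    """Depth of the intersection of the camera ray and projector ray."""
    return baseline_m / (1.0 / math.tan(alpha_rad) + 1.0 / math.tan(beta_rad))

# Symmetric 45-degree rays over a 20 cm baseline place the point 10 cm away:
print(depth_from_angles(0.20, math.radians(45), math.radians(45)))  # ≈ 0.1
```

Because the projector's ray angle is known from the calibrated pattern rather than measured, only the camera side of the correspondence needs to be found, which is what makes the single-camera arrangement possible.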

Time of Flight

LUCID Vision Labs’ Helios ToF 3D camera.

While surface height resolutions of better than 10 μm are achievable using laser scanners at short working distances, other applications demand longer range. For example, applications such as navigation, people-monitoring, obstacle-avoidance, and mobile robots require working distances of several meters. In such applications, it is often simply necessary to understand if an object is present and to measure its position within a few centimeters.

Other applications, such as automated materials-handling systems, operate at moderate distances of 1–3 meters and require more accurate measurements of about 1–5 mm. For such applications, time-of-flight (ToF) imaging can be a competitive solution. ToF systems operate by measuring, for each point of the image, the time it takes for light emitted from the device to reflect off objects in the scene and return to the sensor.

For applications such as automated vehicle guidance, LiDAR scanners can be used to produce a map of their surroundings by emitting a laser pulse that is scanned across the device's field of view (FOV) using a moving mirror. The emitted light is reflected off objects back to the laser scanner's receiver. The returned signal carries both the reflectivity of the object (the attenuation of the signal) and the time delay, which are used to calculate depth through ToF.

Pulsing or Waving

ToF cameras use one of two techniques: pulse modulation (direct ToF) or continuous-wave (CW) modulation. Direct ToF involves emitting a short pulse of light and measuring the time it takes to return to the camera. CW techniques emit a continuously modulated signal and calculate the phase difference between the emitted and returning light waves, which is proportional to the distance to the object.
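Both measurement principles reduce to short formulas. The sketch below assumes ideal timing and phase measurements:

```python
import math

C = 299_792_458.0  # speed of light, m/s

# Direct ToF: the round-trip time t of a light pulse gives d = c * t / 2.
def direct_tof_distance(round_trip_s: float) -> float:
    return C * round_trip_s / 2.0

# CW ToF: the phase shift phi between emitted and returned waves at
# modulation frequency f gives d = c * phi / (4 * pi * f).
def cw_tof_distance(phase_rad: float, mod_freq_hz: float) -> float:
    return C * phase_rad / (4.0 * math.pi * mod_freq_hz)

# A 10 ns round trip corresponds to roughly 1.5 m:
print(direct_tof_distance(10e-9))       # ≈ 1.499
# A pi-radian phase shift at 100 MHz is half the unambiguous range:
print(cw_tof_distance(math.pi, 100e6))  # ≈ 0.749
```

The timescales involved explain why direct ToF demands picosecond-class timing electronics at short range, while CW systems trade that for precise phase measurement at the sensor.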

Phase-based devices are available from several companies, including Texas Instruments and Panasonic, with one of the newest sensors coming from Sony Semiconductor Solutions (Tokyo, Japan). Its SoftKinetic technology features a current-assisted photonic demodulation (CAPD) pixel structure capable of high-speed sampling with high efficiency. This ToF pixel technology is combined with Sony's back-side illumination (BSI) technology to create the new DepthSense ToF sensor. BSI provides better light-collection efficiency at NIR wavelengths.

Point cloud of cardboard boxes inside a larger cardboard box. Distance to camera is 1.5 m.

An example of how DepthSense can be applied is LUCID Vision Labs' Helios ToF 3D camera. The camera can be operated at three working distances using light from four on-board 850 nm VCSEL laser diodes modulated at different frequencies. The camera has a 56° × 43° field of view and an operating range of 0.3 to 6 m. At its highest modulation frequency of 100 MHz, the camera achieves 2.5 mm precision and 5 mm accuracy over its 0.3 to 1.5 m working range. The camera performs on-board processing, producing 3D point cloud data that can be read directly from the device instead of having to be calculated off-chip. The point cloud data is transferred over the camera's GigE interface, which allows users to create their own custom software in C, C++, or C#.
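One consequence of CW modulation is worth noting here: phase wraps every 2π, so the maximum unambiguous range at modulation frequency f is c / (2f). A quick calculation shows why a 100 MHz mode is specified only out to about 1.5 m:

```python
C = 299_792_458.0  # speed of light, m/s

# A CW ToF camera cannot distinguish phase shifts beyond 2*pi, so its
# unambiguous range is c / (2 * f_mod). Higher modulation frequencies
# improve depth precision but shrink the usable range.
def unambiguous_range_m(mod_freq_hz: float) -> float:
    return C / (2.0 * mod_freq_hz)

# At 100 MHz the phase wraps at roughly 1.5 m, consistent with the
# 0.3-1.5 m working range quoted for the highest-frequency mode:
print(round(unambiguous_range_m(100e6), 3))  # → 1.499
```

This is the trade-off behind multi-frequency operation: a lower modulation frequency extends the range, and combining measurements at several frequencies can resolve the wrapping ambiguity.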

In the past, camera vendors needed to supply drivers to their customers so that cameras could be properly configured and controlled by a host computer. With the introduction of the GenICam generic programming interface for machine vision cameras, the camera interface can be decoupled from the user's API, alleviating the need to write multiple camera drivers. Now, with the release of GenICam 3.0, depth and amplitude data or processed point data can be read from cameras without any further data conversion.

Which 3D Technology Is Right for You?

No 3D imaging technology meets the needs of every application. When choosing between an active or passive imaging system, several factors must be considered, such as whether any additional light sources are present, what types of surfaces need to be imaged, and whether any objects in the field of view are specular or will produce multiple reflections.

Light sources such as sunlight or specular reflections off shiny objects can easily saturate cameras used in stereo vision systems. Conversely, low light levels can produce noisy results. For low lighting environments, slightly more expensive active methods including ToF can be used. Applications such as metrology that require micron-level precision measurements may need active laser systems placed relatively close to the object while other active laser systems such as scanning LiDAR can achieve much larger distances, although with lower absolute accuracy.

ToF technology offers several advantages. First, only a single camera is needed and no calibration is required by the developer. Second, the system is much less affected by adverse lighting conditions than traditional passive stereo. Third, the camera can output point cloud data directly, offloading processing from the host PC. Last, the system is less expensive than high-performance active laser systems and comparable in cost with projected-laser-light stereo systems.

The Future for ToF 3D Cameras

3D ToF cameras can be used in systems that previously relied on stereo cameras, for example, robotic pick and place machines. This would reduce the size and weight of the system, and directly outputting point cloud data would cut the processing time required to recognize and localize objects. For applications such as materials inspection that previously used pattern projectors to illuminate areas with little or no surface features, such a camera could be used because ToF technology only requires that objects reflect IR light; it does not require texture for correspondence matching.

This article was written by Jenson Chang, Product Marketing Manager, LUCID Vision Labs, Inc. (Richmond, B.C., Canada).