FLIR Systems has introduced its Firefly machine vision camera with open-platform deep learning inference onboard. Deep learning makes it possible to develop high-performance solutions for difficult vision problems with relative ease.
The embedded Intel Movidius Myriad 2 Vision Processing Unit (VPU) makes it easy to get started with deep learning for machine vision applications by enabling inference on the edge. With inference on the edge, rather than capturing image data and sending it to a central server for processing and decision making, most of this work is done in the camera by means of a pretrained neural network. A central server or cloud platform can still be part of the network topology, but with decision-making happening at the edge, it would only be used to aggregate statistics.
The advantages of inference on the edge include:
Decreased latency. Since decisions are made locally, there is no need to wait for image frames to be transmitted to the server, processed and the “answer” transmitted back.
Decreased bandwidth. Since images are large, transmitting them takes a lot of bandwidth. By only transmitting the “answers” to a server for statistics, far less bandwidth is needed.
Increased reliability. Since decision-making is done on the edge independent of a central server, the system can operate offline.
Increased security and privacy. The small amount of data that is transmitted can easily be anonymized and encrypted.
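The bandwidth advantage can be made concrete with a rough calculation. The sketch below compares streaming uncompressed frames with streaming only a small result record per frame. The resolution, frame rate, and record size are assumed values chosen for illustration, not Firefly specifications.

```python
def stream_bandwidth_bps(bytes_per_item, items_per_second):
    """Return the sustained bandwidth, in bits per second, of a data stream."""
    return bytes_per_item * 8 * items_per_second


# Assumed values for illustration only (not taken from a datasheet).
WIDTH, HEIGHT = 1440, 1080  # pixels; 8-bit monochrome means 1 byte per pixel
FRAME_RATE = 60             # frames per second
RESULT_BYTES = 16           # hypothetical record: class id, confidence, timestamp

raw_bps = stream_bandwidth_bps(WIDTH * HEIGHT, FRAME_RATE)
result_bps = stream_bandwidth_bps(RESULT_BYTES, FRAME_RATE)

print(f"raw images: {raw_bps / 1e6:.1f} Mbit/s")
print(f"results   : {result_bps / 1e3:.2f} kbit/s")
print(f"reduction : {raw_bps / result_bps:,.0f}x")
```

Under these assumptions the result stream is several orders of magnitude smaller than the image stream, which is why an edge device can report to a server over a very modest link.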
Because of its 19 × 19 × 12 mm size, 20 g weight, and 1.5 W nominal power consumption, the Firefly is suited for embedding into compact designs and battery-powered devices. With these features, it can be used as a handheld device, in an intelligent transportation system above a highway, in a flying drone and all sorts of autonomous vehicles, as well as anywhere that video inspection is conducted.
Firefly's open platform enables users to train and deploy neural networks for the camera using a range of frameworks and tools built by companies like Google, Amazon, Nvidia, and Intel. This frees users from the expense of development and runtime licenses and makes it easy to keep pace with the most current frameworks and highest-performing tools for their applications. There are also large and rapidly growing collections of pre-trained networks (many of which are available on a website called Model Zoo) to use as starting points for developing application-specific networks.
Sony Pregius global shutter CMOS sensors are used to ensure clear, distortion-free images of moving objects, even in low light. They have low read-noise and high quantum efficiency for both visible and NIR light, making them suitable for use in a wide range of industrial and biomedical applications.
Deep Learning Versus Inference
Deep learning and inference are closely related but distinct from each other. Deep learning is the process by which deep neural networks are trained over many iterations of testing a model against training images. The results of each iteration are then used to refine the model. Inference, on the other hand, is the use of an already trained network to make predictions on novel images. Inference is how deep learning is used to find answers to real-world problems.
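The distinction can be illustrated with a toy example. The sketch below performs inference only: it runs a forward pass through a single linear layer with fixed, already "trained" weights and never updates them. The weights, feature values, and function names are hypothetical, and a real network deployed on the camera would be far larger.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def infer(weights, biases, features):
    """Forward pass only: the weights are read, never modified."""
    scores = [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, biases)
    ]
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs[best]


# Hypothetical pre-trained parameters for a two-class problem.
WEIGHTS = [[1.0, 0.0], [0.0, 1.0]]
BIASES = [0.0, 0.0]

class_index, confidence = infer(WEIGHTS, BIASES, [2.0, 0.0])
print(class_index, round(confidence, 3))
```

Training would add a loop that compares predictions with labels and adjusts `WEIGHTS` and `BIASES`; that step happens off-camera, and only the finished parameters are deployed.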
A pre-trained neural network can be loaded into the Firefly using the Intel Neural Compute Stick SDK. Images captured by the camera can then be used as input to the neural network, which makes predictions based on those images. The open-source Google MobileNets family of small, computationally efficient, high-accuracy networks can be used as a starting network structure for developing Firefly applications.
While the Firefly cannot retrain its own network on-camera, it has several features that facilitate ongoing refinement. For example, the confidence interval of each prediction is provided by the camera as GenICam chunk data (metadata associated with each image) and can be used to identify those images that have low-confidence results. The low-confidence images can then be saved to the host system for further analysis. Once labeled, they can be included in future training datasets.
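On the host side, this refinement loop amounts to filtering results by confidence. The following is a minimal sketch that assumes the chunk data has already been parsed into simple records; the `InferenceResult` type, its field names, and the 0.6 threshold are hypothetical and do not represent the camera SDK's API.

```python
from dataclasses import dataclass


@dataclass
class InferenceResult:
    """Hypothetical host-side record parsed from per-image chunk data."""
    frame_id: int
    class_label: str
    confidence: float  # assumed to be in the range 0.0 to 1.0


def select_for_relabeling(results, threshold=0.6):
    """Return the results whose confidence falls below the threshold."""
    return [r for r in results if r.confidence < threshold]


results = [
    InferenceResult(1, "pass", 0.97),
    InferenceResult(2, "fail", 0.41),
    InferenceResult(3, "pass", 0.88),
]

# Frames listed here would be saved, labeled by a person, and added to
# the next training dataset.
for r in select_for_relabeling(results):
    print(f"save frame {r.frame_id}: {r.class_label} ({r.confidence:.2f})")
```

The threshold trades labeling effort against coverage: a higher value sends more images for review, while a lower value keeps only the most uncertain ones.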
Having the Movidius Myriad 2 VPU on board the Firefly eliminates the need for a separate PCIe or USB Myriad 2 adapter to add inference. The USB 3.1 Gen 1 interface, USB3 Vision protocol, and GenICam support ensure compatibility with a range of off-the-shelf hardware and software. Four bi-directional GPIO pins enable camera triggering and synchronization of external hardware such as lighting. Inference results can also be indicated by GPIO signaling. One object class can be assigned to each GPIO pin, which will change state once that class has been recognized and a user-specified confidence interval has been met.
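The class-to-pin behavior described above can be modeled in a few lines. This sketch only simulates the decision logic on a host; the pin numbers, class names, and threshold are assumed for illustration, and on the camera the mapping would be configured through the camera's own settings rather than through code like this.

```python
def gpio_levels(pin_to_class, label, confidence, threshold):
    """Return {pin: active} for one inference result.

    A pin is active only when its assigned class was recognized and the
    confidence meets the user-specified threshold.
    """
    return {
        pin: (assigned == label and confidence >= threshold)
        for pin, assigned in pin_to_class.items()
    }


# Hypothetical assignment of one object class per GPIO pin.
PIN_TO_CLASS = {0: "good", 1: "scratched", 2: "cracked"}

print(gpio_levels(PIN_TO_CLASS, "cracked", 0.92, threshold=0.8))
print(gpio_levels(PIN_TO_CLASS, "cracked", 0.55, threshold=0.8))
```

In a stand-alone setup, a pin that goes active could drive a reject actuator or indicator light directly, with no host computer in the loop.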
Classification for Qualitative Inspection
Traditional rules-based software is ideal for straightforward tasks such as barcode reading or checking a manufactured part against specifications. But inspection systems using inference-based object classification can answer much more subjective questions. For example, a network could be trained to differentiate between food produce that is or is not export grade. The final determination of whether a piece of produce is of acceptable quality depends on a combination of its size, shape, color, and uniformity. The large amount of variation within each of these criteria, and the way they combine to produce a final determination, is very challenging for traditional methods.
High-accuracy subjective quality inspection can be achieved by using a network trained for classification. FLIR has demonstrated an example of a subjective quality inspection of painted camera cases using the Firefly in a stand-alone configuration. Other examples of object classification applications are differentiating between awake and drowsy drivers or between cosmetically scratched and defectively cracked solar panels.
Qualitative inspection using inference enables manufacturers to detect process drift much earlier than is possible with manual inspection. Compared to humans, inference-based inspection is far more consistent across multiple inspection stations. With less variance in the inspection criteria across stations, trends can be identified within this variance much more easily, enabling corrective action to be taken earlier. [Figure 3]
Results for object classification can be output over GPIO, enabling the Firefly to operate as a stand-alone inspection system for certain applications. Classification results can also be output as GenICam chunk data. The practical upper limit on the number of classes that the Firefly can differentiate between is high. Myriad 2 optimized networks that can differentiate between the 1,000 classes of objects in the ImageNet dataset are available online.
Object Detection and Tracking
The Firefly makes it simpler to add object detection and tracking capabilities to an embedded system such as a drone or robot. Object detection and positioning results can be output as GenICam chunk data in the form of coordinates, or as bounding boxes drawn into each frame. The greater complexity of networks trained for this purpose, however, means the maximum frame rate will be lower than for less complex classification networks.
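Once bounding box coordinates reach the host, a simple way to follow objects from frame to frame is to match boxes by overlap. The sketch below uses intersection over union (IoU) with greedy matching. This is a generic host-side technique named here as an assumption, not a description of the camera's internal tracking, and the `(x1, y1, x2, y2)` box format is likewise assumed.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_tracks(tracks, detections, min_iou=0.3):
    """Greedily map each track id to the unused detection with the best IoU.

    tracks: {track_id: box} from the previous frame.
    detections: list of boxes from the current frame.
    Returns {track_id: detection_index}; unmatched tracks are omitted.
    """
    assignments = {}
    used = set()
    for track_id, box in tracks.items():
        best_idx, best_score = None, min_iou
        for idx, det in enumerate(detections):
            if idx in used:
                continue
            score = iou(box, det)
            if score >= best_score:
                best_idx, best_score = idx, score
        if best_idx is not None:
            assignments[track_id] = best_idx
            used.add(best_idx)
    return assignments


previous = {1: (0, 0, 10, 10), 2: (50, 50, 60, 60)}
current = [(51, 50, 61, 60), (1, 0, 11, 10)]
print(match_tracks(previous, current))
```

Greedy matching is adequate when objects are well separated; crowded scenes usually call for a globally optimal assignment instead.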
The Firefly can be used for object classification, detection, and tracking. Combined with its global shutter CMOS image sensor, these functions enable extremely diverse applications, each with different requirements for inference accuracy and speed. Balancing these parameters by optimizing the structure of the neural network and its training is key to developing a successful inference-based solution. Uses of object detection and tracking include quality inspection of discrete parts, detecting out-of-stock items on a supermarket shelf, identifying and locating weeds for farm robots, and hazard detection for a UAS collision avoidance system.
Firefly can also serve as a trigger for more powerful but less efficient image acquisition and processing systems. By detecting and localizing objects of interest on a high-efficiency, low-power platform, Firefly can ensure that high-power systems are only called upon when required. In unmanned aerial vehicles, this reduces battery consumption to maximize flight time, while in industrial systems, it saves power and money.
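The power saving from using the camera as a trigger can be estimated with simple arithmetic. In the sketch below, the 1.5 W camera figure is the nominal consumption quoted earlier; the 15 W figure for the high-power system and the 10% duty cycle are hypothetical values chosen only to show the calculation.

```python
def average_power_w(camera_w, high_power_w, duty_cycle):
    """Average draw when the high-power system runs only part of the time.

    duty_cycle is the fraction of time (0.0 to 1.0) that the camera's
    detections wake the high-power system.
    """
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be between 0.0 and 1.0")
    return camera_w + high_power_w * duty_cycle


CAMERA_W = 1.5       # nominal camera power quoted in this article
HIGH_POWER_W = 15.0  # hypothetical processing system

always_on = average_power_w(CAMERA_W, HIGH_POWER_W, 1.0)
triggered = average_power_w(CAMERA_W, HIGH_POWER_W, 0.1)

print(f"always on: {always_on:.1f} W")
print(f"triggered: {triggered:.1f} W")
```

The benefit scales with how rarely objects of interest appear: the lower the duty cycle, the closer the average draw approaches the camera's own consumption.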
Deep learning is just starting to take off in the machine vision industry, but its impact is already being felt. With new product classes like Firefly, we expect to see current systems being optimized and augmented by deep learning. We also expect the emergence of entirely new types of applications that were previously impossible with traditional rules-based coding.