Whether you are involved in robotics or other domains, pattern recognition is essential to progress. A new device, called the CogniMem chip, offers a hardware solution to this problem. CogniMem can be considered a true Artificial Intelligence device because it has been designed specifically for the purpose of learning, memorizing and recognizing.
What Are Intelligent Systems?
An intelligent system receives basic training, rather than programming, to react to its operating environment. It should then be able to adapt to fast or slow changes occurring in this environment and to report them, which implies an ability to perceive. We can postulate that intelligence pertains more to active memories reacting to incoming stimuli than to a processor endlessly playing a programmed "recipe".
As Jeff Hawkins describes in his landmark book, On Intelligence, an essential difference between the biological approach and the computing approach to intelligence lies in the fact that biology uses active memory cells (neurons), while computers use a procedural activity based on the "fetch, decode and execute" model. Intelligence is also about adaptive learning, whether it is supervised or not. This implies that the memory cells can snoop the response of other cells before deciding to learn a new model or to change the confidence level with which they recognize an existing model. This process has to occur in real time and must not be affected by the number of connected cells.
Pattern Recognition Challenges
First, let's describe some of the challenges faced in "real world" image and signal recognition applications. There are at least three:
- Acquisition stability and noise problems;
- Pattern characterization and data reduction process;
- Accuracy and speed of the classifier (decision making component).
Real world patterns are subject to noise and jitter. For example, a visual object such as a mug in a scene will have many variations. A homogeneous lighting variation changes the overall contrast of an image; this kind of variation can be dealt with using the well-known normalized correlation. Light reflecting on the shiny mug, however, changes its apparent shape, so the mug can no longer be recognized with standard methods. Inspection of "nature made" products is another example of difficult images to deal with. While two herrings can look identical to the human eye, their digitized images might belong to different computed models. Recognizing a large population of these fish involves building a statistically viable model base. This model set can be very large, so a standard computer going through the models sequentially needs hefty computation capability, leading to high power consumption, heating issues, and a large footprint.
Performance should be defined in terms of speed, footprint, power consumption, real-time learning and non-linear classification capabilities. Devices with a high clock frequency but a single fetch and decode path do not run parallel processes efficiently. While the concept of trainable neural networks has been known for decades, and software implementations have exhibited a fair number of drawbacks, very few hardware implementations with fully digital parallelism have reached the industry. Today, the CogniMem chip (CM1K) offers a very efficient alternative to sequential processors such as RISC processors or DSPs for near-sensor recognition of images, signals or parameters.
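The behavior of normalized correlation under lighting changes can be illustrated with a minimal Python sketch. The pixel values and the function name are made up for the illustration; the point is only that a uniform change of contrast and brightness leaves the score at 1, while a local reflection that alters part of the patch lowers it.

```python
import math

def normalized_correlation(a, b):
    """Zero-mean normalized correlation of two equal-length pixel sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("patches must be non-empty and of equal length")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    numerator = sum(x * y for x, y in zip(da, db))
    denominator = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denominator == 0:
        return 0.0  # a flat patch has no structure to correlate
    return numerator / denominator

# Illustrative 1-D "patch" of gray levels (invented values).
template = [10, 40, 80, 40, 10, 5]
# Same patch under a homogeneous lighting change: contrast x1.5, brightness +20.
brighter = [1.5 * p + 20 for p in template]
# Same patch with a local highlight that saturates two pixels.
glare = [10, 40, 255, 255, 10, 5]

score_brighter = normalized_correlation(template, brighter)
score_glare = normalized_correlation(template, glare)
```

Here `score_brighter` stays at 1 (up to rounding) and `score_glare` drops below it, which is why reflections call for several learned models rather than a single template.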
What Is CogniMem Technology?
The CogniMem chip is a fully parallel silicon neural network. In other words, it is a chain of 1024 identical elements called neurons, which are accessed in parallel and have their own "genetic" material to learn and recall patterns at unmatched speed, thanks to a self-contained architecture. The CogniMem chip can be considered a pattern recognition co-processor or a companion chip for sensors. The first result of this architecture is a constant learning and recognition time regardless of the number of connected neurons. The second is the ability to expand the network at will by cascading chips: connecting "n" CM1K chips through their parallel bus multiplies the size of the network by "n", and the "n" chips behave as a single component with no need for an external controller. Designers can therefore stack chips to store more knowledge without impacting the recognition time. Finally, the neurons can be partitioned easily so that they react only to a given context. This allows training the network to recognize patterns that have nothing in common but can be combined to build a more robust decision based on multiple criteria, such as voice and face recognition. While the hardware architecture is very innovative, the neural network model it implements, a Radial Basis Function (RBF) network of the Restricted Coulomb Energy type, is one of the most acclaimed for pattern classification.
CM1K allows constant-speed matching of an incoming pattern (pixel block, signal slice, parameter vector, etc.) with previously learned models. Typically, an incoming pattern is evaluated for similarity against 1024 learned patterns, or more with multiple CogniMem chips, in 10 μs at most. The footprint of the CogniMem TQFP100 package is 14 mm by 14 mm (8 mm by 8 mm die size), with a maximum power consumption of 0.5 W. Each pattern feed performs up to 300,000 absolute/accumulations in less than 300 clock cycles. Typically, a CogniMem chip performs 30 Giga absolute/accumulations per second. At least 90 high-end DSPs running at 300 MHz would be required for similar pattern recognition performance. In addition, the real-time learning capability enables applications that could not be envisioned before, such as real-time relearning of targets and instant enrollment in biometrics.
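To make the neuron model more concrete, here is a minimal software sketch of a Restricted Coulomb Energy style classifier with contexts, written in Python. It is a generic textbook-style illustration, not the CM1K register interface: the names `Neuron`, `RceNetwork`, `MAX_INFLUENCE` and `MIN_INFLUENCE`, the L1 distance, and the influence values are all assumptions. The loop over neurons is sequential here, whereas on the chip every neuron evaluates the pattern at the same time.

```python
from dataclasses import dataclass

MAX_INFLUENCE = 1000  # hypothetical starting influence field of a new neuron
MIN_INFLUENCE = 2     # hypothetical floor below which a field is not shrunk

@dataclass
class Neuron:
    context: int      # partition the neuron belongs to
    prototype: list   # the stored pattern
    category: int     # label taught with the pattern
    influence: int    # radius within which the neuron fires

def l1_distance(a, b):
    """Sum of absolute differences between two patterns."""
    return sum(abs(x - y) for x, y in zip(a, b))

class RceNetwork:
    def __init__(self, capacity=1024):
        self.capacity = capacity
        self.neurons = []

    def firing(self, pattern, context=1):
        """(distance, neuron) pairs of the context that fire, closest first."""
        hits = [(l1_distance(pattern, n.prototype), n)
                for n in self.neurons if n.context == context]
        hits = [(d, n) for d, n in hits if d < n.influence]
        hits.sort(key=lambda hit: hit[0])
        return hits

    def classify(self, pattern, context=1):
        """Category of the closest firing neuron, or None if nothing fires."""
        hits = self.firing(pattern, context)
        return hits[0][1].category if hits else None

    def learn(self, pattern, category, context=1):
        """Teach one example; return True if a new neuron was committed."""
        recognized = False
        for distance, neuron in self.firing(pattern, context):
            if neuron.category == category:
                recognized = True
            else:
                # A neuron of another category fired: shrink its field.
                neuron.influence = max(distance, MIN_INFLUENCE)
        if recognized:
            return False
        if len(self.neurons) >= self.capacity:
            raise RuntimeError("no free neuron left")
        # The new field stops short of the nearest rival prototype.
        rivals = [l1_distance(pattern, n.prototype) for n in self.neurons
                  if n.context == context and n.category != category]
        influence = max(MIN_INFLUENCE, min([MAX_INFLUENCE] + rivals))
        self.neurons.append(Neuron(context, list(pattern), category, influence))
        return True

net = RceNetwork()
net.learn([10, 10], category=1)
net.learn([200, 200], category=2)             # shrinks the first neuron's field
net.learn([10, 10], category=7, context=2)    # separate partition
```

In this sketch the same pattern can carry different categories in different contexts, and a context with no trained neurons simply returns no answer, which mirrors the "unknown" response a trainable classifier needs.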
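The throughput figures above can be cross-checked with a few lines of arithmetic, shown here as a small Python sketch. It assumes the 27 MHz system clock quoted later in this article and, for the comparison, a hypothetical DSP that completes one absolute/accumulate operation per clock cycle; both are assumptions made for the illustration, not datasheet values.

```python
CLOCK_HZ = 27_000_000       # system clock quoted elsewhere in the article
OPS_PER_PATTERN = 300_000   # absolute/accumulations per pattern feed
CYCLES_PER_PATTERN = 300    # upper bound on clock cycles per pattern feed
DSP_CLOCK_HZ = 300_000_000  # 300 MHz DSP from the comparison
DSP_OPS_PER_CYCLE = 1       # assumption: one absolute/accumulate per cycle

# Operations completed in parallel on each clock cycle.
ops_per_cycle = OPS_PER_PATTERN / CYCLES_PER_PATTERN

# Sustained operations per second for one chip.
chip_ops_per_second = ops_per_cycle * CLOCK_HZ

# Number of such DSPs needed to match one chip.
dsp_equivalent = chip_ops_per_second / (DSP_CLOCK_HZ * DSP_OPS_PER_CYCLE)

print(f"{ops_per_cycle:.0f} operations per cycle")
print(f"{chip_ops_per_second / 1e9:.0f} Giga operations per second")
print(f"{dsp_equivalent:.0f} DSPs at 300 MHz")
```

Under these assumptions the chip sustains about 27 Giga operations per second, in line with the rounded figure of 30 Giga, and the equivalent DSP count comes out at 90.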
Using CogniMem for Real World Applications
As mentioned before, the CM1K chip can be considered a pattern recognition co-processor; it delivers deterministic pattern recognition immediately upon receipt of a data vector, whether it derives from a signal, sound, video or data stream from any source. CogniMem frees programmers from including the data recognition part in the multi-tasking, priority and scheduling management of their RTOS. The recognition occurs autonomously and in real time within one clock cycle, and the classification within 36 clock cycles (i.e. about 1.3 μs if the system clock is at 27 MHz).
In a typical hardware configuration, the CM1K receives the data vector to recognize from a microprocessor (MPU) and/or a field programmable gate array (FPGA), which itself interfaces with the sensors that deliver the input signals and with the actuators and communication ports that broadcast the final decision, or action, to take. The code on the processor/FPGA varies with the complexity of the application. A typical example of a more complex case is to read and interpret the responses of all the firing neurons, not just the one with the best match. This can lead to uncertainty management, hypothesis generation, and ultimately the use of a partition of the neural network to make a higher level classification, taking full advantage of the powerful non-linear classifier embedded in the CM1K. Also, if an application requires monitoring multiple sensor inputs and making a decision on them, the processor and/or FPGA can be used to target different partitions of the neural network, where neurons have been trained to recognize patterns coming from different sources. This generates multiple contextual responses from the different partitions of the network, again to be consolidated and possibly sent to another partition for a final decision. Other applications, such as data mining, can justify very large arrays of CogniMem chips leading to millions of parallel neurons, still accessed in thirty-six clock cycles.
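The consolidation step that runs on the processor or FPGA can be sketched in Python as follows. This is only one possible scheme under stated assumptions: each partition (context) is assumed to return a list of (distance, category) pairs for its firing neurons, closer neurons are given more weight, and the function names, the weighting and the 0.6 agreement threshold are all hypothetical choices for the illustration.

```python
from collections import defaultdict

def consolidate(firing):
    """Turn one context's firing neurons into (category, confidence).

    `firing` is a list of (distance, category) pairs; closer neurons
    weigh more. Returns (None, 0.0) when no neuron fired.
    """
    if not firing:
        return None, 0.0
    weights = defaultdict(float)
    for distance, category in firing:
        weights[category] += 1.0 / (1.0 + distance)
    best = max(weights, key=weights.get)
    return best, weights[best] / sum(weights.values())

def fuse(responses, threshold=0.6):
    """Combine per-context decisions into one category, or None.

    `responses` maps a context name to its list of firing neurons.
    A silent context counts against every candidate, so a decision
    needs broad agreement across the sensors.
    """
    if not responses:
        return None
    totals = defaultdict(float)
    for firing in responses.values():
        category, confidence = consolidate(firing)
        if category is not None:
            totals[category] += confidence
    if not totals:
        return None
    best = max(totals, key=totals.get)
    agreement = totals[best] / len(responses)
    return best if agreement >= threshold else None

# Face and voice partitions agree on the same person.
agree = {"face": [(12, "alice"), (40, "bob")], "voice": [(8, "alice")]}
# The two partitions point to different people: no decision is made.
disagree = {"face": [(12, "alice")], "voice": [(8, "bob")]}

decision_agree = fuse(agree)        # "alice"
decision_disagree = fuse(disagree)  # None
```

Returning no decision when the partitions disagree is one simple form of uncertainty management; an application could instead forward the per-context results to another partition trained to make the final classification, as described above.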
We are surrounded by problems that can be solved through pattern recognition: the obvious recognition of an image, a fingerprint, a barcode or a face and, by extension, motion detection; the recognition of a sound or a signal; and text search. Experts want to store their knowledge and share it. The task can be as simple as recognizing plants, birds, and fish or as complex as detecting a tumor on a scan. The precious time of specialists is indispensable, but once they have taken the time to train the neurons, the knowledge can be spread. If more is discovered, more can be added to the training.
How can it really live up to expectations with so little code involved? That is what the team is actively working to demonstrate. Now that the chip has finally materialized, simple tools must be built that allow end-users to experiment promptly with its capabilities.