The surge in content generated by cameras across consumer and industrial sectors has placed a burden on machines’ capacity to acquire, process, and use visual data in a practical and efficient manner. The current challenges include: an overwhelming amount of collected data (much of which is irrelevant for machines); insufficient processing capabilities (especially in applications constrained by size and power); and the demand for real-time processing. Consequently, developers of vision-enabled systems — spanning smartphones, wearables, smart homes, IoT systems, automotive technologies, and industrial automation equipment — are seeking ways to transform the traditional approach to vision sensing and data acquisition.
Camera technology originated in providing images for human consumption, and its historical progress — relying primarily on frame-based methods — is proving inadequate for the requirements of modern machine vision. For years, machine vision has relied on visual information acquired and structured for human interpretation: video streams composed of sequential images captured by an image sensor. Each image represents a static snapshot at a particular moment, lacking dynamic information. This method of gathering visual data is prevalent in most machine vision systems designed to monitor changes and movement within dynamic environments.
The predominant challenge arises when there is movement or change in a scene, which is common in most machine vision applications; this is where the inherent limitations of frame acquisition become apparent. Regardless of the frame rate chosen, a camera capturing a moving scene will misrepresent it: fast motion is undersampled between frames while static regions are redundantly resampled. Since different parts of a scene typically exhibit varying dynamics simultaneously, a single sampling rate regulating pixel exposure across the entire imaging array inevitably fails to capture these concurrent scene dynamics adequately.
Less Is More When Sensing Events
Compounding this challenge, traditional image sensors are slow and energy-intensive, produce excessive redundant data, and have limited dynamic range, which makes them ill-suited for machine vision tasks, particularly in demanding operating environments. Consequently, biologically inspired “neuromorphic” event-based vision systems are emerging as alternatives that offer higher speed, minimal latency, better power efficiency, and broader dynamic range — qualities well matched to many machine vision applications.
Event-based vision marks a paradigm shift in how visual information is acquired and processed for modern machine vision uses. Utilizing neuromorphic techniques inspired by the human vision system, this approach seeks to enhance efficiency and performance in various vision-enabled systems across consumer, industrial, automotive, and other sectors to elevate safety, productivity, and user experience.
Event-based vision operates differently from traditional cameras in that it departs from a uniform acquisition rate for all pixels. Instead, each pixel independently determines its sampling timing based on changes in incident light, thanks to dedicated intelligence per pixel. Contrast-detection information is encapsulated in “events,” each comprising the pixel’s x,y coordinates and the precise time of event generation. With Prophesee’s patented event-based sensors, for example, pixels activate intelligently upon detecting contrast changes (motion), enabling continuous capture of essential motion details at the pixel level.
The key difference in moving away from fixed frame rates is that each pixel adjusts its sampling according to its own visual input. Each pixel determines its sampling points by reacting to variations in the incident light level, so the sampling process is no longer dictated by an artificial timing source but by the signal itself, specifically by temporal fluctuations in signal amplitude. The camera’s output thus evolves from a sequence of images into a continuous stream of individual pixel data, generated conditionally based on scene dynamics.
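This sampling principle can be made concrete in a few lines of code. The sketch below models each pixel as remembering the log-intensity at which it last fired and emitting an event — carrying coordinates, polarity, and a timestamp — whenever the change since then exceeds a contrast threshold. It is a minimal illustration of the general event-camera model under assumed names and threshold values, not Prophesee’s implementation.

```python
import math
from dataclasses import dataclass

@dataclass
class Event:
    x: int          # pixel column
    y: int          # pixel row
    polarity: int   # +1 for a brightness increase, -1 for a decrease
    t_us: int       # timestamp, microseconds

def make_event_generator(width, height, contrast_threshold=0.2):
    # Per-pixel reference levels: each pixel remembers the log-intensity
    # at which it last fired, so sampling is driven by the signal itself.
    ref = [[None] * width for _ in range(height)]

    def process_sample(x, y, intensity, t_us):
        """Emit an event iff this pixel's log-intensity change since
        its last event exceeds the contrast threshold."""
        logi = math.log(max(intensity, 1e-6))
        if ref[y][x] is None:       # first observation: set the reference
            ref[y][x] = logi
            return None
        delta = logi - ref[y][x]
        if abs(delta) >= contrast_threshold:
            ref[y][x] = logi
            return Event(x, y, 1 if delta > 0 else -1, t_us)
        return None                 # static pixel: no data produced

    return process_sample
```

Note that a perfectly static scene produces no output at all; only pixels that see change generate data.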
Event sensors offer several advantages: high-speed operation (equivalent to 10,000 fps), highly efficient power consumption (down to the microwatt range), low latency for quicker response times, reduced data processing needs (10-10,000x less than frame-based systems), and a high dynamic range of up to 120 dB. These features make event sensors suitable for a wide variety of applications and products.
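A rough back-of-envelope comparison puts the data-reduction claim in perspective. The figures below (resolution, frame rate, event rate, bytes per event) are illustrative assumptions rather than measured specifications, but they show how a sparse event stream can undercut even a modest frame-based stream; busier or quieter scenes shift the ratio across the quoted 10-10,000x range.

```python
# Back-of-envelope raw data rates, using assumed (illustrative) figures.
frame_rate_fps  = 30
width, height   = 1280, 720
bytes_per_pixel = 1          # 8-bit grayscale

frame_bytes_per_sec = frame_rate_fps * width * height * bytes_per_pixel

events_per_sec  = 200_000    # assumed moderate scene activity
bytes_per_event = 8          # assumed packed x, y, polarity, timestamp

event_bytes_per_sec = events_per_sec * bytes_per_event

print(f"frame-based: {frame_bytes_per_sec / 1e6:.1f} MB/s")   # ~27.6 MB/s
print(f"event-based: {event_bytes_per_sec / 1e6:.1f} MB/s")   # ~1.6 MB/s
print(f"reduction:   {frame_bytes_per_sec / event_bytes_per_sec:.0f}x")
```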
Applying Event-Based Vision
Initially, neuromorphic event sensors found commercial use not in machines but for humans: vision restoration in visually impaired individuals. This led to use cases in industrial automation and process monitoring, which demonstrated the benefits of event sensors for numerous vision tasks, especially those involving fast-moving and changing elements, unpredictable ambient lighting, and limited resources. Subsequent generations of event-based systems have been applied in industrial settings for tasks such as high-speed counting, preventive maintenance (e.g., vibration monitoring), and enhancing robotic efficiency and safety, as well as eye tracking and gesture tracking for AR/VR and various logistics and safety/security applications.
These inherent advantages make event sensors ideal for IoT applications. Power consumption plays a critical role in IoT devices, particularly battery-powered ones, and event-based vision operates at significantly lower power levels than frame-based camera systems. Moreover, event-based cameras excel in the challenging lighting conditions common to many IoT applications because they respond to relative changes in light rather than absolute intensity. Their high dynamic range allows them to capture a wide span of light intensities within a single scene, making them well suited to environments with varying lighting, such as outdoor scenes in bright sunlight or nighttime settings.
With a dynamic range exceeding 120 dB, event-based cameras can function effectively in environments where traditional cameras struggle with varying lighting conditions, whether extremely bright settings such as public spaces or vehicles during the day, or dimly lit scenarios such as nighttime operations or dark factory floors. Furthermore, these cameras offer minimal latency by transmitting information only when brightness changes within the scene. This real-time response proves advantageous in swiftly changing lighting situations, such as abrupt shifts from light to dark or vice versa. And because event-based cameras detect individual changes in light intensity, they are less prone to motion blur than conventional frame-based cameras.
This property is particularly valuable in scenarios involving rapid movement, where it preserves sharp image quality. New uses that take advantage of this benefit are being developed for smartphone cameras, for example through Prophesee’s partnership with Qualcomm to integrate its event-based technology with the popular Snapdragon platform.
Further development of event sensors for IoT involves adapting them to edge vision tasks, where their sparse data output suits platforms with limited onboard computing capability. However, challenges such as unconventional data formats, variable data rates, and non-standard interfaces have hindered broader adoption. To address this, the latest generation of event sensors, exemplified by Prophesee’s GenX320, aims to improve integration and usability in embedded edge vision systems through features such as event data pre-processing and formatting, compatible data interfaces, and low-latency connectivity with a variety of processing platforms, including energy-efficient neuromorphic processors. The GenX320, for instance, offers multiple pre-processing functions, adaptable interfaces, and power management options to serve power-sensitive vision applications efficiently.
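One widely used pre-processing step of this kind is binning a slice of the event stream into a fixed-size, frame-like tensor that conventional SoC pipelines and neural networks can consume directly. The sketch below shows the general idea; it is a generic illustration, not the GenX320’s actual on-chip formatting.

```python
import numpy as np

def events_to_histogram(events, width, height, t_start_us, t_end_us):
    """Bin a time slice of an event stream into a 2-channel histogram
    (one channel per polarity) — a frame-like tensor that standard
    vision pipelines can consume. `events` is an iterable of
    (x, y, polarity, t_us) tuples."""
    hist = np.zeros((2, height, width), dtype=np.uint8)
    for x, y, p, t_us in events:
        if t_start_us <= t_us < t_end_us:
            ch = 0 if p > 0 else 1
            # Saturating add keeps the output at a fixed 8-bit width.
            if hist[ch, y, x] < 255:
                hist[ch, y, x] += 1
    return hist
```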
Despite their operational efficiency, optimizing event sensors for the low-power budgets of IoT deployments remains critical. Implementing a range of power modes and application-specific operating modes can significantly improve energy efficiency for “always on” applications. On-chip intelligent power management mechanisms can further refine sensor flexibility and usability; Prophesee’s solutions have demonstrated power consumption as low as 36 µW with smart wake-on-events functionality enabled. Supporting deep-sleep and standby modes can also be beneficial.
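The wake-on-events idea can be sketched as a simple host-side control loop: remain in a microwatt monitoring mode while the scene is quiet, switch to full capture when the event rate spikes, and drop back after a quiet period. The sensor interface, mode names, and thresholds below are hypothetical, used only to illustrate the strategy.

```python
import time

# Illustrative host-side wake-on-events loop. The `sensor` object,
# its mode names, and the thresholds are hypothetical placeholders,
# not Prophesee's actual API.
ACTIVITY_THRESHOLD = 500   # events/sec treated as "something happening"
IDLE_TIMEOUT_S     = 5.0   # quiet period before returning to low power

def monitor(sensor):
    sensor.set_mode("low_power")          # microwatt-range monitoring
    last_activity = time.monotonic()
    while True:
        rate = sensor.event_rate()        # current events per second
        now = time.monotonic()
        if rate > ACTIVITY_THRESHOLD:
            sensor.set_mode("active")     # full-rate capture
            last_activity = now
        elif now - last_activity > IDLE_TIMEOUT_S:
            sensor.set_mode("low_power")  # scene quiet again
        time.sleep(0.1)
```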
Specific considerations for an event sensor targeting IoT applications include microsecond-resolution time-stamping of events with minimal latency, along with seamless interfacing with standard SoCs through integrated event data pre-processing functions. MIPI or CPI output interfaces provide fast connectivity with embedded processing platforms such as low-power microcontrollers and modern neuromorphic processor architectures. Sensor-level privacy is supported by the sparse, frameless nature of event data, which inherently omits static scene content.
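For illustration, microsecond time-stamped events are typically delivered as compact packed words that the host unpacks on arrival. The 64-bit field layout below is hypothetical — real vendor formats, such as Prophesee’s EVT encodings, differ and are defined by the vendor — but the decoding pattern is representative.

```python
# Decoding a hypothetical 64-bit packed event word. The field layout
# is illustrative only and does not match any real sensor format.
def decode_event(word: int):
    t_us     = word & 0xFFFFFFFF        # 32-bit microsecond timestamp
    x        = (word >> 32) & 0x7FF     # 11-bit column (0..2047)
    y        = (word >> 43) & 0x7FF     # 11-bit row
    polarity = 1 if (word >> 54) & 0x1 else -1
    return x, y, polarity, t_us
```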
Event-based sensors are now being used in a broader range of applications. By integrating these sensors with IoT platforms, product developers can meet specific market needs related to power consumption and size. Use cases include foveated rendering for enhanced AR/VR experiences; eye tracking for human-machine interfaces and safety applications such as driver monitoring and emotion detection; always-on capabilities for security purposes such as fall-detection cameras; and gesture/hand tracking for immersive interfaces. In the AR/VR domain, applications such as inside-out tracking and constellation tracking based on flickering LEDs enable precise tracking of objects or controllers.
Further new use cases, enabled by advances in silicon technology, are under development, including high-speed structured-light 3D sensing that enables point-cloud generation at kilohertz repetition rates for industrial applications. Privacy-conscious smart home systems such as fall-detection units are also proliferating, as the vision technology addresses privacy concerns by not capturing or transmitting images.
Event-based vision is well on its way to establishing itself as a paradigm that will set a new standard in many markets requiring efficiency in how machines see. Over the past several years, it has evolved to meet a wider range of uses, and as it continues to adapt to the requirements of many applications, event-based cameras will become increasingly common all around us.
This article was written by Luca Verre, CEO and Co-Founder, Prophesee (Paris, France). For more information, visit here.