Human vision gives us a huge evolutionary advantage, at the cost of sustaining a brain powerful enough to interpret the vast amount of data it produces. Evolution’s frugal nature therefore led to the emergence of shortcuts in the visual-processing centres of our brains to cope with this data deluge. The photoreceptors in our eyes only report back to the brain when they detect a change in some feature of the visual scene, such as its contrast or luminance. Evolutionarily, it is far more important for us to be able to concentrate on the movement of a predator within a scene than to take repeated, indiscriminate inventories of the scene’s every detail. Recent research on humans’ ability to recognize objects suggests that humans can gather useful data from a scene that is changing at rates of up to 1,000 times a second – a far higher rate than the 24, 30 or 60 frames per second we use to represent movement on television or in movies.
A huge amount of useful information is encoded in these changes, which most fixed frame-rate cameras never even see because of their low sampling rates. Event-based sensing doesn’t use a fixed frame rate; instead, each pixel reports what it sees only when it senses a significant change in its field of view. This approach reduces the amount of redundant data transmitted by the sensor, saving processing power, bandwidth, memory and energy. It enables sensors to be built with much higher dynamic ranges than is usually the case, because each pixel automatically adapts to the incident light. For this reason, event-based sensors aren’t overwhelmed by high contrast in the scene, such as a car’s headlights at night, in the way that a conventional sensor would be. And event-based sensors make it cost-effective to record events that would otherwise require conventional cameras running at up to tens of thousands of frames per second.
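To make the principle concrete, the sketch below models how a single event-based pixel might behave. It is a minimal illustration in Python of the general event-camera idea – the pixel tracks the log of its incident intensity and emits an ON or OFF event whenever that value has drifted by more than a contrast threshold since the last event – not Prophesee’s actual circuit, and the function name and threshold value are assumptions.

```python
import numpy as np

def events_from_pixel(intensities, timestamps, contrast_threshold=0.15):
    """Illustrative single-pixel event generator: emit (timestamp, polarity)
    whenever the log-intensity has moved by more than contrast_threshold
    since the level at which the previous event was emitted."""
    events = []
    log_ref = np.log(intensities[0])              # log-intensity at the last event
    for t, i in zip(timestamps[1:], intensities[1:]):
        delta = np.log(i) - log_ref
        while abs(delta) >= contrast_threshold:   # a large step emits a burst of events
            polarity = 1 if delta > 0 else -1     # ON for brighter, OFF for darker
            events.append((t, polarity))
            log_ref += polarity * contrast_threshold
            delta = np.log(i) - log_ref
        # otherwise the pixel stays silent, so a static scene produces no data
    return events

# A slow drift produces no events; a sudden jump produces a burst at that instant.
print(events_from_pixel([100, 101, 102, 300, 300], [0, 1000, 2000, 3000, 4000]))
```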

The Prophesee event-based sensor’s output is designed as a time-continuous data stream in which each visual event is represented by the addresses of the pixels that sense it. This spatio-temporal data stream provides a more direct way of representing dynamics and motion in the event sensor’s field of view than inferring them from frame-to-frame processing of a standard sensor’s output. These characteristics create opportunities to rethink today’s imaging and machine-vision data processing, and to address emerging computer-vision strategies, such as machine learning, in a new way.
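As an illustration of what such a stream can look like, the hypothetical record below captures the usual ingredients of an address-event representation: a pixel address, a timestamp and a polarity. The field names, types and values are assumptions made for clarity, not the actual format of Prophesee’s sensor output or SDK.

```python
from dataclasses import dataclass

@dataclass
class Event:
    x: int         # pixel column address
    y: int         # pixel row address
    t: int         # timestamp, e.g. in microseconds
    polarity: int  # +1 for a brightness increase, -1 for a decrease

# The stream is simply a time-ordered sequence of such events, with no frames:
stream = [
    Event(x=120, y=45, t=1_000, polarity=+1),
    Event(x=121, y=45, t=1_012, polarity=+1),
    Event(x=119, y=46, t=1_030, polarity=-1),
]
```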
Prophesee’s event-based approach to vision sensing means that vision system designers who want to capture fast events no longer need to make a tradeoff between running their cameras at high frame rates and dealing with large amounts of redundant data. The volume of data the sensor produces is now governed by the activity in its field of view, automatically adjusting as the scene conditions evolve. Looking at a static scene will generate no events, but if there is a burst of action, the camera adapts automatically to capture it instantly. This makes it easier and more cost effective to acquire and analyze very fast motion, even if it is interleaved with times or areas in which motion is absent.
Each pixel provides information at the rate of change in its field of view, not at an arbitrary, preset and fixed frame rate. An event-based approach also means that dynamic scenes can be analyzed as a highly resolved sequence of events that form spatio-temporal patterns representing features such as the edges, trajectories, or velocities of objects. The mathematics describing such features in space and time is simple and elegant and so yields efficient algorithms and computational rules. In one comparison, Prophesee’s event-based approach to dynamic sensing achieved temporal resolutions of tens of kilohertz where a frame-based approach struggled to reach 60 Hz.
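As a toy example of how little mathematics such a feature can need, the sketch below (reusing the illustrative Event record above, and assuming the events all belong to one edge moving horizontally) recovers the edge’s velocity with a single least-squares line fit through the (time, position) trace of its events.

```python
import numpy as np

def edge_velocity(events):
    """Estimate the horizontal velocity (pixels per second) of a moving edge
    from the spatio-temporal pattern of its events: fit x = v * t + x0."""
    t = np.array([e.t * 1e-6 for e in events])    # microseconds -> seconds
    x = np.array([e.x for e in events], dtype=float)
    v, _x0 = np.polyfit(t, x, 1)                  # slope of the event trace is the velocity
    return v
```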
This is possible because each visual event is handled as an incremental change to a continuous signal, which can be analyzed at low computational cost compared to the simultaneous analysis of all the pixels in many complete frames. An event-based approach also makes it easier to correlate multiple views of a scene. This eases tasks such as 3D depth reconstruction in multi-camera stereoscopy set-ups, because if two or more cameras sense an event at the same instant, it is likely they are observing the same point. Also, analyzing the way in which the illuminance of a single pixel changes over time enables the development of new ways to solve key vision challenges, such as object recognition, obstacle avoidance, and the simultaneous localization and mapping processes that are vital to enabling vehicle autonomy.
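The sketch below illustrates the temporal-coincidence idea behind such stereo matching under strong simplifying assumptions: it pairs events from two cameras purely by how close their timestamps are, whereas a real pipeline would also enforce the epipolar (geometric) constraint between the two views.

```python
def match_events(left_events, right_events, max_dt=100):
    """Pair each left-camera event with the right-camera event closest in time,
    keeping the pair only if the gap is within max_dt microseconds."""
    left_events = sorted(left_events, key=lambda e: e.t)
    right_events = sorted(right_events, key=lambda e: e.t)
    matches, j = [], 0
    for le in left_events:
        # advance the right-stream cursor while the next right event is closer in time
        while j + 1 < len(right_events) and \
                abs(right_events[j + 1].t - le.t) <= abs(right_events[j].t - le.t):
            j += 1
        if right_events and abs(right_events[j].t - le.t) <= max_dt:
            matches.append((le, right_events[j]))
    return matches
```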
A New Dimension in Machine Learning

Event-based sensing enables new approaches to machine learning. An object recognition or detection algorithm that, until now, could only use the spatial information from a frame can now access another dimension: time. For example, in a frame-based representation, a plastic bag and a rock lying in the road in front of a car may look a lot like each other but their dynamics, as captured by an event-based sensor, would clearly distinguish them. Event-based sensors can also provide information that is difficult or impossible to derive using frames.
Where a frame-based camera only sees a snapshot of the world, an event-based camera can see the oscillations of the arms and legs of a pedestrian, the rapidly rotating wheels of a car or a cyclist pedaling her bicycle as a distinct signal. With GPUs and other machine learning platforms focusing on processing growing volumes of data to increase their performance, the temporal information provided by event-based sensors could enable a complete shift in the inner properties used to recognize objects.
This may either enable better generalization of learning (i.e., dramatically reducing the size of the datasets required for efficient machine learning) or enable the capture of more subtle aspects of a scene, such as the intent of a pedestrian waiting at a crossing, which can be read from slight changes in posture that a frame-based camera would not see.
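One published way to hand this temporal dimension to a learning algorithm is a “time surface”, a dense array in which each pixel stores an exponentially decayed measure of how recently it last fired. The sketch below is a minimal, assumed implementation of that general idea (reusing the illustrative Event record above), not Prophesee’s own pipeline.

```python
import numpy as np

def time_surface(events, width, height, t_now, tau=50_000.0):
    """Each pixel holds exp(-(t_now - t_last_event) / tau): recently active
    pixels are close to 1 and stale ones fade toward 0, producing an array a
    conventional classifier or CNN can consume while still encoding when
    each pixel last saw motion."""
    last_t = np.full((height, width), -np.inf)    # -inf means "never fired"
    for e in events:
        if e.t <= t_now:
            last_t[e.y, e.x] = max(last_t[e.y, e.x], e.t)
    return np.exp(-(t_now - last_t) / tau)
```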
Improved Robustness

In parallel with the advantages brought by the asynchrony and temporal resolution of the sensor, its large dynamic range and logarithmic sampling of the scene make many computer vision algorithms more robust. One of the big challenges of computer vision is the uncontrolled nature of outdoor lighting.
Algorithms that perform very well in lab conditions may be much less effective when blinded by the sun or missing details hidden in shadows. In such conditions, the event-based sensor’s sensitivity to changes in illuminance means that the way it perceives events does not change in low light, bright light, or very high dynamic range scenes. This means event-based sensors are well suited to making seamless transitions from indoor to outdoor conditions, a particularly useful capability in, for example, autonomous driving systems that have to cope with the vehicles to which they are fitted moving into or out of tunnels.
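The underlying invariance is easy to verify numerically: because temporal contrast is measured in the log domain, it depends only on the ratio of intensities, so a global change in scene illumination cancels out. The snippet below is a simple check of that property with made-up numbers.

```python
import numpy as np

i1, i2 = 40.0, 52.0   # one pixel's intensity before and after a change in the scene
k = 25.0              # the whole scene becomes 25x brighter (e.g. shade vs. direct sun)

contrast_dim    = np.log(i2) - np.log(i1)            # log-contrast in dim light
contrast_bright = np.log(k * i2) - np.log(k * i1)    # same change in bright light

# log(k*i2) - log(k*i1) == log(i2/i1): the illumination factor k drops out,
# so the event threshold is crossed (or not) identically in both conditions.
assert np.isclose(contrast_dim, contrast_bright)
```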
An AMD Industry First
In May, Prophesee partnered with AMD to make its Metavision HD sensor available for use with the AMD Kria KV260 Vision AI Starter Kit. It marks the industry’s first event-based vision development kit compatible with an AMD platform, providing customers a platform to both evaluate and go to production with an industrial-grade solution for target applications such as smart city and machine vision, security cameras, retail analytics, and many others.
The development platform for the AMD Kria K26 System-on-Module (SOM), the KV260 Vision AI Starter Kit is built for advanced vision application development without requiring complex hardware design knowledge or FPGA programming skills. AMD Kria SOMs for edge AI applications provide a production-ready, energy-efficient FPGA-based device with enough I/O to speed up vision and robotics tasks at an affordable price point. Combined with Prophesee’s breakthrough event-based vision technology, the kit lets machine-vision system developers leverage the lower latency and lower power of the Metavision platform to experiment and to create applications that are more efficient than traditional frame-based vision sensing allows, and in many cases not previously possible.

A breakthrough plug-and-play Active Markers Tracking application is included in the new kit. It allows for >1,000 Hz 3D pose estimation, with complete background rejection at pixel level while providing extreme robustness to challenging lighting conditions. This application highlights unique features of Prophesee’s event-based Metavision technologies, enabling a new range of ultra-high-speed tracking use cases such as game controller tracking, construction site safety, heavy load anti-sway systems and many more.
“The ever-expanding Kria ecosystem helps make motion capture, connectivity, and edge AI applications more accessible to roboticists and developers,” said Chetan Khona, Senior Director of Industrial, Vision, Healthcare and Sciences Markets, AMD. “Prophesee Event-based Vision offers unique advantages for machine vision applications. Its low data consumption translates into efficient energy consumption, less compute and memory needed, and fast response times.”
This article was produced from materials supplied by Prophesee (Paris, France) and has been edited.