The industry push for more automated driving features relies heavily on advanced sensing, and key to that is LiDAR (light detection and ranging). The closest thing to commercially available automated driving in the U.S. is ADAS (advanced driver-assistance systems). These span SAE automation Levels 0–2, which assist drivers but require them to remain in control at all times.
LiDAR combines the best features of two other sensing technologies, radar and cameras, while overcoming their drawbacks. For example, it has higher resolution than radar and, unlike cameras, it is not sensitive to ambient light.
A major challenge for automotive driving sensors is to provide accurate, real-time, minimal-latency sensing of a vehicle's surroundings while it is traveling at high speed. Even congested urban environments are less challenging because of their lower speeds, typically under 40 miles per hour.
LiDAR is intrinsically well suited to meeting this challenge because it is based on measuring the time for a transmitted laser pulse to return to a receiver. Since the speed of light is a constant, distance follows directly from that round-trip time, and the range resolution of a LiDAR sensor is limited mainly by how precisely that time can be measured.
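The time-of-flight arithmetic is simple enough to sketch (the example pulse time below is illustrative, not a MicroVision figure):

```python
# Time-of-flight ranging: a pulse travels to the target and back,
# so range = (speed of light x round-trip time) / 2.
C = 299_792_458.0  # speed of light in vacuum, m/s

def tof_range_m(round_trip_s: float) -> float:
    """Range to a target from the measured round-trip pulse time."""
    return C * round_trip_s / 2.0

# A 1-microsecond round trip corresponds to a target roughly 150 m away.
print(round(tof_range_m(1e-6), 1))  # 149.9
```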
We recently spoke to Sumit Sharma, CEO of MicroVision, about the company's new MAVIN™ DR LiDAR system, which is aimed at helping automotive manufacturers improve their ADAS systems and eventually move toward SAE Level 3 conditionally automated driving. A key aspect of this system is that it combines short-, medium-, and long-range sensing in one package.
Tech Briefs: Could you say something about how this new LiDAR system is different from other automotive LiDAR systems?
Sumit Sharma: Let's first look at the problem that we have to solve and then I will compare our product with the others.
The question here is: can we make every car safer than it is with a human driver? I'm over 49 years old and I've never been in an accident, so I would consider myself a pretty safe driver. It would be a tall order to make a car even safer than most of us who are safe drivers. However, I can also say that over the course of my driving history there have been times when I've come pretty close, although I pulled back from the edge — I didn't get hurt.
But some people may not be as lucky as me. It all comes down to our response time and our visual acuity — we have to pay attention: we have to assess what's happening, we have to assess other vehicles' velocities, we have to look at weather conditions. We have to process all of that and be cognitively present in the moment at all times.
As to cognition, the advantage of computers is that they are always on and always focused on whatever task they are given. A computer can be trained to do the tasks that we perform automatically while driving.
When a computer is fully tasked with just one set of instructions, it responds faster than the human brain. Therefore, the response to an event would not be last-minute yanking on the wheel to get out of trouble; it would be more controlled.
So, as an analogy, think about Lewis Hamilton, the Formula One driver. Because he's trained to be a world-class driver, he can identify dangers much faster than most drivers — his reflexes are significantly better — so within less than 180 milliseconds he can navigate away. Even an above-average driver may take 300 milliseconds, and that's the difference between being safe all the time and merely being able to pull back from the brink of an accident.
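The arithmetic behind that analogy is straightforward; reaction distance is simply speed times reaction time (the 30-m/s highway speed below is an assumed example):

```python
# Reaction distance = speed x reaction time.
def reaction_distance_m(speed_mps: float, reaction_s: float) -> float:
    return speed_mps * reaction_s

speed = 30.0  # m/s, roughly 67 mph (assumed example speed)
gap = reaction_distance_m(speed, 0.300) - reaction_distance_m(speed, 0.180)
print(round(gap, 2))  # 3.6 -- nearly a car length of extra travel
```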
So, what's needed? What is my brain doing and what do I need to teach the computer to be able to do?
The computer has to be able to recognize or perceive the scene — what's happening within its field of view — very, very quickly. It has to respond whether I'm driving on a six-lane highway in Los Angeles, for example, or on a two-lane country road with opposing traffic, which is where most traffic accidents happen.
And for that we need several things. Number one, you have to digitize the entire scene — a camera module would have to take a series of pictures in succession, so it's like a movie. Those pictures then have to be converted to some sort of 3D map for understanding the depth and velocity of objects as they're moving.
But a LiDAR is digital to begin with — its first step is to digitize the scene. So, you save a lot of time there. However, you need significantly higher resolution than is available from most of the LiDARs now on the market. You need resolution equivalent to your eyesight, and you need it to be digital. You also need low latency: 24–30 Hz LiDAR streaming and 100 milliseconds for tracking and planning, during which everything is perceived and classified and a maneuver is planned.
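To put those latency figures in perspective, here is a back-of-the-envelope sketch (the 70-mph speed is an assumed example) of how far a vehicle travels during one LiDAR frame and during the 100-millisecond tracking-and-planning window:

```python
MPH_TO_MPS = 0.44704  # miles per hour -> meters per second

def distance_traveled_m(speed_mph: float, interval_s: float) -> float:
    """Distance covered at a given speed during a given time interval."""
    return speed_mph * MPH_TO_MPS * interval_s

frame_period_s = 1 / 30       # 30-Hz streaming -> ~33 ms per frame
planning_window_s = 0.100     # perception, classification, and planning

print(round(distance_traveled_m(70, frame_period_s), 2))     # ~1.04 m per frame
print(round(distance_traveled_m(70, planning_window_s), 2))  # ~3.13 m per decision
```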
What our dynamic view LiDAR effectively does is look at three ranges: near-field, mid-field, and far-field. Like the high resolution of a person's foveal view, it maintains high resolution across all three fields of view.
Let me elaborate a little bit about why this is important. Let's say you and I were sitting across the table and talking and you're one or two meters away from me. My focus would be on you — everything behind you and on your periphery is out of focus in my brain. Anytime I want to bring something into focus, I have to turn my head.
In the case of our dynamic view LiDAR, the wide field of view has high resolution. As we look further out, we collapse the field of view but continue to have high resolution.
In our system, all three fields of view are happening simultaneously, so there is no assessing where you have to look to refocus. There is no particular area of interest — the dynamic LiDAR continuously looks at near-, mid-, and far-distances and digitizes the entire field of view at very low latency.
So, that's the biggest differentiation between our system and others: you would have a device that can give you high resolution at range and at low latency.
Tech Briefs: Does your LiDAR do information processing?
Sumit Sharma: Our MicroVision MAVIN solution combines LiDAR hardware and perception software. The hardware includes a microcontroller, DSP, and memory onboard. Our custom digital ASIC allows us to perform edge perception along with all of our system controls and proprietary closed-loop algorithms. Taken together, our integrated hardware and software solution provides OEMs with robust data and insights to deliver high-speed safety features.
Tech Briefs: You’ve also said that your three LiDAR inputs could be fused with radar. What would radar add?
Sumit Sharma: Because we have high frame rates in our system, we can predict the axial and radial velocities of every object. The vertical (Z) component doesn't change that much, but the other two components are very important.
If you think about driving on a three-lane highway in the middle lane, the scariest part is when somebody is about 10 meters to your right and you're in their blind spot. You're doing everything just fine, but they start coming in and they don't see you, and you know what's about to happen. So, either you have to yank the steering wheel or apply the brakes.
But a computer, if it knew the velocities and the changes about to happen in the field, could tell the intent of those vehicles quite precisely, before you could even begin to predict it. To do that, though, both the sensing and the analysis have to be very fast.
Systems like radar and FMCW (frequency-modulated continuous-wave) LiDAR, for example, use the Doppler effect. But the Doppler effect gives you only one aspect of velocity: it tells you how fast something is moving away from or toward you, not how fast it's moving sideways. To plan these kinds of maneuvers you need both components. Our LiDAR provides that information directly to the system controller.
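The geometry of that distinction can be sketched as follows: from two position fixes of a tracked object, the full velocity vector splits into a radial component (what Doppler measures) and a sideways component (what Doppler misses). This is an illustrative sketch, not MicroVision's algorithm:

```python
import math

def velocity_components(p0, p1, dt):
    """Split frame-to-frame velocity into radial and sideways parts.

    p0, p1: (x, y) positions in the sensor frame; dt: time between frames.
    """
    vx, vy = (p1[0] - p0[0]) / dt, (p1[1] - p0[1]) / dt
    r = math.hypot(p1[0], p1[1])
    ux, uy = p1[0] / r, p1[1] / r        # unit line-of-sight vector
    v_radial = vx * ux + vy * uy         # toward/away: what Doppler sees
    v_sideways = -vx * uy + vy * ux      # perpendicular: invisible to Doppler
    return v_radial, v_sideways

# A car 20 m ahead and one lane over, drifting toward us at 2 m/s:
vr, vs = velocity_components((10.0, 20.0), (9.9, 20.0), 0.05)
print(round(vr, 2), round(vs, 2))  # Doppler alone reports under half the true 2 m/s motion
```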
But because of safety requirements, you have to have redundancy in the system. At this moment in time, the best sensor for velocity is radar — it's on every car, it's a commodity. So, you could reconcile two independent sensors to predict the velocity. The LiDAR provides two components of the velocity and the radar, one. So, the object’s velocity is being confirmed through two independent sensors. If they’re fused, they have the same datum coordinate systems — they're normalized — and that enables the computer system to make a decision faster. Fusing these things together also gives them more certainty.
Time is our most significant budget — how fast can we react? We can certainly include the radar stream. A lot of the algorithms we're using on the LiDAR side can be adapted for improved performance of the radar.
Because the streams are fused and act together, that provides a big benefit to the computer system. It does not have to do all the mathematics — it's all done in the digital signal processor (DSP) and is provided from the LiDAR. So, the computational overhead is reduced.
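A hypothetical sketch of that cross-check: project the LiDAR-derived velocity onto the radar's line of sight and compare it with the radar's Doppler reading, assuming both sensors have already been normalized to one coordinate frame (the tolerance below is arbitrary):

```python
import math

def radial_speeds_agree(position, lidar_velocity, radar_radial_mps, tol=0.5):
    """True if the two independent sensors confirm each other's velocity."""
    r = math.hypot(position[0], position[1])
    ux, uy = position[0] / r, position[1] / r          # line of sight
    lidar_radial = lidar_velocity[0] * ux + lidar_velocity[1] * uy
    return abs(lidar_radial - radar_radial_mps) <= tol

# Target 50 m straight ahead, closing at 10 m/s; radar agrees:
print(radial_speeds_agree((0.0, 50.0), (0.0, -10.0), -10.0))  # True
```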
Tech Briefs: How does your system detect radial velocity?
Sumit Sharma: If you are digitizing many frames at a very fast rate, you can run algorithms that look at these digitized points and put a cluster around them. I don't need to know if it's a piece of tumbleweed or a Fiat; I know that there's a cluster of points moving together. If I can identify these clustered points, I can determine the velocity of their centroid. As you go frame to frame, that cluster could be shifting forward and back and also side to side.
With high frame rates, you can quickly identify the cluster and start tracking its velocity.
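That centroid-tracking idea can be sketched in a few lines (the clustering step itself, grouping returns into an object, is omitted, and the points and frame rate are made up for illustration):

```python
def centroid(points):
    """Mean (x, y) of a cluster of LiDAR returns."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def cluster_velocity(frame0, frame1, dt):
    """Velocity of a cluster from the frame-to-frame shift of its centroid."""
    c0, c1 = centroid(frame0), centroid(frame1)
    return ((c1[0] - c0[0]) / dt, (c1[1] - c0[1]) / dt)

# The same cluster of returns seen one frame (1/30 s) apart:
f0 = [(10.0, 30.0), (10.2, 30.1), (9.9, 29.8)]
f1 = [(10.1, 29.5), (10.3, 29.6), (10.0, 29.3)]
vx, vy = cluster_velocity(f0, f1, 1 / 30)
print(round(vx, 1), round(vy, 1))  # drifting sideways, closing fast
```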
Tech Briefs: But isn't it important to know whether the object is a tumbleweed or a car?
Sumit Sharma: Let me break that down for you. It could be a child on a bike, it could be tumbleweed, it could be a paper bag. Our premise is: don't hit any of them; it doesn't matter which. If you're going at very high speed, don't hit any of them, because you don't know what the consequences will be. So, if I do a classification and say that's tumbleweed, there's only a probability that it is tumbleweed; it could be a paper bag. One could argue, "Oh, I could probably blow right through it, and I shouldn't have to brake or even change course."
The premise of ADAS, however, is that if this is going to work in every situation, globally, on every road, and people are going to trust the technology, you shouldn't hit anything. You shouldn't have to make a choice between five people working on one railroad track and one person working on the other. The object should simply be avoided: hit the brakes, steer, and get away from it.
So, we focus our efforts on whether a space is drivable or not. We tag the point cloud of the cluster and say this cluster is an object you need to avoid. It's almost like there's a green space and a red space.
We don't do the planning and the maneuvering, the OEMs and their software do it, but we pre-tag the point cloud, identifying the space. We try to help them save time by saying, we believe this is drivable space or it's not drivable, even though we do not distinguish between a tumbleweed, a paper bag, or any other solid object.
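A toy version of that green/red tagging, with a purely hypothetical straight-ahead corridor standing in for the OEM's real drivable-space geometry:

```python
def tag_cluster(centroid_xy, corridor_half_width_m=2.0, max_range_m=150.0):
    """Tag a cluster 'red' (avoid) if it sits in the ego corridor, else 'green'."""
    x, y = centroid_xy  # x: lateral offset from ego, y: distance ahead
    blocked = abs(x) <= corridor_half_width_m and 0.0 < y <= max_range_m
    return "red" if blocked else "green"

print(tag_cluster((0.5, 40.0)))  # red: in our lane, 40 m ahead
print(tag_cluster((6.0, 40.0)))  # green: two lanes over
```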
Tech Briefs: So, you fuse radar and LiDAR. What about cameras, what about video? Can't that give you more precise information about what kind of object it is?
Sumit Sharma: Classification, which is what you're describing, we leave to the OEMs at this point. Based on the work we've done, we don't believe full-blown camera fusion is needed for every case. If we look at "drive by wire," then we would fuse standard commodity camera modules with LiDAR for the near field, and fuse LiDAR with radar for the mid and far fields.
The major tier one in this space is Mobileye, and even the CTO of Mobileye talks about how the next level of innovation is going to be LiDAR fused with radar. If you want to get to a true SAE Level 3 high-speed highway pilot system, there simply isn't enough computing power to fuse video with LiDAR and radar.
We can digitize with our LiDAR at high enough resolution to cluster the data points and tag them fast enough to reconcile that with a secondary sensor like the radar — that is a big one. You then have all the information you need to write software that can, without massive machine-learning algorithms, decide where to drive or not, and then quickly decide on a course of action.
Tech Briefs: What do you mean when you say your three different LiDARs are combined in one form factor that complements OEM design?
Sumit Sharma: In all the time I've spent with OEMs, three specific requirements always come up: the LiDAR should be the size of a video cassette or smaller, it has to fit inside the vehicle, and it has to be low power. Aesthetics are very important. If the car looks good, selling it is easier; if you have things sticking out that don't look beautiful, it's harder.
Tech Briefs: That’s a complaint I've heard about LiDAR, that you have all this big stuff on top of the car.
Sumit Sharma: We've endeavored to make our device very, very small, optimized for size. There are three spots that manufacturers would like it to fit into so it blends with the body. Number one is to hide it behind the rear-view mirror on the windshield. The windshield is good because you get a high vantage point, which is always nice for anything you're measuring. And if there is some dirt in front, you have windshield wipers — the cleaning system is already incorporated. If you have enough laser power, the windshield can be adapted for that.
The other place they always like to explore is the headlights. Now that they're moving to LED-based headlamps, they have room available.
And the third place, which is where you see it in cars right now, is in the grill. But from what I hear, although they put it there, it's not preferred. That's where their branding is — the grill is part of the branding — and they don't want tech in the middle of it.
All three of these locations have the same requirements: the LiDAR must be small enough to fit inside the body, it must perform as promised on the data sheet, and the consumer or vehicle owner should not see the beauty or shape of the car affected.
To achieve that you must have a very small overall package size. So, we have optimized our package for thinness.
The sample that we talked about in Munich last year, which is our static field LiDAR, is about 34 millimeters tall. Our dynamic view LiDAR, because of some augmentation that was requested and some features we wanted to add, is about 40 millimeters tall. But all these sensors have such a low profile that they can be hidden inside different parts of the body without breaking the lines of the vehicle.
So, it's about respecting that the OEMs know their customers and they know the styling of the car is one of the primary reasons people select a car. More even than particular features, people just want a beautiful vehicle, they want to drive it, and they want to be safe.
Tech Briefs: Will your dynamic LiDAR be economical enough for the consumer automobile market?
Sumit Sharma: Our goal is to develop a solution at a price point that will allow OEMs to embrace LiDAR across their fleets — not just in high-end models. Our supply chain and business model both support that goal. From a technology and supply chain perspective, our solution is based on known and scaled silicon technologies. Our MEMS have already been scaled to a 200-mm wafer, and we have started the analog and digital ASIC cycles, which will also be at the 200-mm wafer size.
Our scaled 905-nm lasers are integrated into our custom laser module, and our detector is based on a scaled automotive-grade silicon photomultiplier (SiPM). There are no exotic semiconductor technologies inside the LiDAR. The magic comes from the proprietary system design and the implementation of the digital and analog ASICs running our custom software. What this all means is that while our technology is very advanced, we're bringing it to market with an industrialized manufacturing mindset — so that it can be produced at scale, with high quality, and at a cost that makes it feasible to adopt across the consumer auto market.