Compact Depth Sensor Image
The flat metalens (shown in the middle) captures images of a 3D scene, for example candle flames placed at different locations (left), and produces a depth map (right) using an efficient computer vision algorithm inspired by the eyes of jumping spiders. Color on the depth map represents object distance: closer objects are red and farther objects are blue. (Image courtesy of Qi Guo and Zhujun Shi/Harvard University)

Inspired by spiders, researchers at the Harvard John A. Paulson School of Engineering and Applied Sciences (SEAS) have developed a compact and efficient depth sensor that could be used onboard microrobots, in small wearable devices, or in lightweight virtual and augmented reality headsets. The device combines a multifunctional, flat metalens with an ultra-efficient algorithm to measure depth in a single shot.

Many of today’s depth sensors, such as those in phones, cars, and video game consoles, use integrated light sources and multiple cameras to measure distance. Face ID on a smartphone, for example, uses thousands of laser dots to map the contours of the face. This works for large devices with room for batteries and fast computers, but what about small devices with limited power and computation, like smart watches or microrobots?

Humans measure depth using stereo vision: when we look at an object, each of our two eyes collects a slightly different image. Try this: hold a finger directly in front of your face and alternate opening and closing each of your eyes. See how your finger moves? Our brains take those two images, examine them pixel by pixel and, based on how the pixels shift, calculate the distance to the finger.
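The geometry behind that finger experiment fits in one line. For two parallel cameras, distance equals the focal length times the spacing between the cameras, divided by how far the object shifts between the two images. The sketch below is a minimal illustration under that idealized pinhole-camera assumption; the function name and the numbers (an 800-pixel focal length and a 6.5 cm eye spacing) are hypothetical and do not come from the researchers.

```python
def depth_from_disparity(focal_length_px: float, baseline_m: float,
                         disparity_px: float) -> float:
    """Distance to a point seen by two parallel pinhole cameras: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_length_px * baseline_m / disparity_px


# Hypothetical values: 800 px focal length, 6.5 cm between the two "eyes".
near_m = depth_from_disparity(800.0, 0.065, 104.0)  # large shift, about 0.5 m away
far_m = depth_from_disparity(800.0, 0.065, 26.0)    # small shift, about 2 m away
```

The larger the shift between the two views, the closer the object, which is why a finger near the face appears to jump so much.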

“That matching calculation, where you take two images and perform a search for the parts that correspond, is computationally burdensome,” said Todd Zickler, Professor of Electrical Engineering and Computer Science at SEAS. “Humans have a nice, big brain for those computations, but spiders don’t.”
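To make the cost of that search concrete, here is a minimal sketch of one of the simplest forms of matching along a single image row, assuming a sum-of-absolute-differences comparison between small patches. The function name, window size and search range are illustrative choices, not anything described by the SEAS team.

```python
def match_disparity(left, right, x, half_window=2, max_disparity=8):
    """Find the shift d that best aligns the patch around left[x] with right[x - d]."""
    best_d, best_cost = 0, float("inf")
    for d in range(max_disparity + 1):           # try every candidate shift
        cost = 0.0
        for k in range(-half_window, half_window + 1):
            xl, xr = x + k, x + k - d
            if xl < 0 or xl >= len(left) or xr < 0 or xr >= len(right):
                cost = float("inf")              # patch falls off the row
                break
            cost += abs(left[xl] - right[xr])    # sum of absolute differences
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d


# A bright bump in the left row appears three pixels further left in the right row.
left = [0, 0, 0, 0, 0, 1, 5, 9, 5, 1, 0, 0, 0, 0, 0, 0]
right = left[3:] + [0, 0, 0]
shift = match_disparity(left, right, x=7)
```

Even in this toy version, every pixel requires comparing a whole patch against every candidate shift, and a real image repeats that work across millions of pixels.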

Jumping spiders have evolved a more efficient system to measure depth. Each principal eye has a few semi-transparent retinae arranged in layers, and these retinae measure multiple images with different amounts of blur. For example, if a jumping spider looks at a fruit fly with one of its principal eyes, the fly will appear sharper in one retina’s image and blurrier in another. This change in blur encodes information about the distance to the fly.
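A simple way to see why blur carries distance information is the textbook thin-lens model, in which a point of light spreads into a circle whenever the sensor does not sit exactly where the lens brings it to focus. The sketch below assumes that model; the function name, the 10 mm focal length, the 2 mm aperture and the two sensor positions standing in for layered retinae are all hypothetical and are not measurements of a spider's eye.

```python
def blur_diameter(object_distance, focal_length, aperture, sensor_distance):
    """Diameter of the blur circle for a point source under the thin-lens model.

    All lengths share one unit (metres here); object_distance must exceed focal_length.
    """
    # Thin-lens equation, 1/f = 1/Z + 1/v: v is where the point comes to focus.
    focus_distance = 1.0 / (1.0 / focal_length - 1.0 / object_distance)
    # Similar triangles: the light cone is `aperture` wide at the lens and
    # shrinks to a point at focus_distance; the sensor cuts it at sensor_distance.
    return aperture * abs(sensor_distance - focus_distance) / focus_distance


# Two hypothetical sensor layers behind the same lens.
FRONT_LAYER, BACK_LAYER = 0.0102, 0.0104

for distance in (0.3, 1.0):
    front = blur_diameter(distance, 0.010, 0.002, FRONT_LAYER)
    back = blur_diameter(distance, 0.010, 0.002, BACK_LAYER)
    sharper = "front" if front < back else "back"
    print(f"object at {distance} m: sharper on the {sharper} layer")
```

Under these assumed numbers, a nearby object is sharper on the back layer and a distant one is sharper on the front layer, so comparing the two blurs indicates where the object sits.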

In computer vision, this type of distance calculation is known as depth from defocus. But so far, replicating nature's approach has required large cameras with motorized internal components that capture differently focused images over time. This limits the speed and practical applications of the sensor.

That’s where the metalens comes in.

SEAS researchers have already demonstrated metalenses that can simultaneously produce several images containing different information. Building on that research, the team designed a metalens that can simultaneously produce two images with different blur. Instead of using layered retinae to capture multiple simultaneous images, as jumping spiders do, the metalens splits the light and forms two differently defocused images side by side on a photosensor. An ultra-efficient algorithm then interprets the two images and builds a depth map to represent object distance.
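The article does not spell out the team's algorithm, so the following is only a textbook-style geometric sketch of how two blur measurements can pin down distance without any search. It assumes a thin-lens model in which both images share the same aperture and sensor distance, each image is sharpest at a known distance, the object lies between those two distances, and a blur size has already been estimated at each pixel. All names and numbers are hypothetical.

```python
def depth_from_two_blurs(blur_1, blur_2, focus_dist_1, focus_dist_2):
    """Estimate distance from the blur sizes of one point in two images.

    Image 1 is assumed sharpest at focus_dist_1 and image 2 at focus_dist_2,
    with the object somewhere between those two distances.
    """
    total = blur_1 + blur_2
    if total <= 0:
        raise ValueError("at least one image must be defocused")
    # Under the thin-lens model, inverse depth interpolates linearly between
    # the two in-focus inverse distances, weighted by the other image's blur.
    inverse_depth = (blur_1 / focus_dist_2 + blur_2 / focus_dist_1) / total
    return 1.0 / inverse_depth


def depth_map(blur_map_1, blur_map_2, focus_dist_1, focus_dist_2):
    """Apply the per-pixel estimate to two equally sized 2D blur maps."""
    return [
        [depth_from_two_blurs(b1, b2, focus_dist_1, focus_dist_2)
         for b1, b2 in zip(row_1, row_2)]
        for row_1, row_2 in zip(blur_map_1, blur_map_2)
    ]


# Hypothetical setup: image 1 is sharpest at 0.3 m, image 2 at 0.6 m.
depths = depth_map([[0.0, 1.0]], [[1.0, 1.0]], 0.3, 0.6)
```

Each pixel needs only a handful of arithmetic operations, in contrast to the patch-by-patch search that stereo matching requires.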

“Metalenses are a game-changing technology because of their ability to implement existing and new optical functions much more efficiently, faster, and with much less bulk and complexity than existing lenses,” said Federico Capasso, Professor of Applied Physics and Electrical Engineering. “Fusing breakthroughs in optical design and computational imaging has led us to this new depth camera that will open up a broad range of opportunities in science and technology.”