New software developed at Saarland University turns any camera into an eye-contact detector. Why is it so valuable to identify eye contact? We spoke with the inventor about the new kinds of applications enabled by the technology.

Andreas Bulling is head of the Perceptual User Interfaces Group at the Max Planck Institute for Informatics and the Cluster of Excellence on Multimodal Computing and Interaction (MMCI) at Saarland University. Together with his PhD student Xucong Zhang and former postdoc Yusuke Sugano, Bulling developed the detection method, which employs a new generation of algorithms for estimating gaze direction.

Using a special type of “deep learning” neural network, the system can spot a person’s gaze independent of the environment, the camera position, and the size of the target object.

Tech Briefs: Why is the detection of eye contact with a target so challenging?

Dr. Andreas Bulling: Eye-contact detection is a special case of the more general problem of gaze estimation. While gaze estimation is about automatically figuring out, from an image of the eye, exactly where a person is looking, eye contact is about a binary decision: "Is the user looking at target X?"

Using only a single camera, the new software detects whether one or even several people establish eye contact with a target object (green box) or not (red box). (Credit: Saarland University)

Gaze estimation is a long-standing problem in image processing and computer vision, with recent efforts focusing on off-the-shelf cameras (e.g., the kind integrated into laptops). Core challenges here are the low image resolution; the large variability in eye appearance, illumination conditions, and users' head pose; and the fact that this is inherently a 3D task.

Solving these challenges would enable so-called calibration-free gaze estimation, in which a user can simply sit in front of a computer and the integrated camera could directly provide accurate and robust gaze estimates without any personal calibration of the system. Eye-contact detection shares these same challenges.

Tech Briefs: How does your technology work? How is eye contact detected?

Dr. Andreas Bulling: We build on recent advances in learning-based gaze estimation. We obtain gaze estimates using a pre-trained gaze estimation model — a “deep learning”-based method.

These gaze estimates alone are far too inaccurate for reliable eye-contact detection. So the trick is that we first cluster the estimates: we identify local areas of high gaze-estimate density, assume the closest cluster corresponds to our target object, and finally take all gaze estimates from that cluster to retrain our original model on the fly, yielding a target-specific eye-contact detector.
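The pipeline Bulling describes (noisy gaze estimates → densest cluster → target-specific binary detector) can be sketched in a few lines. This is a simplified stand-in, not the authors' implementation: it uses a crude neighbour-count density mode instead of their clustering method, and a distance threshold to the cluster centroid in place of retraining the deep model. All names and parameters (`radius`, `margin`) are illustrative assumptions.

```python
import math

def densest_cluster(points, radius=0.05):
    """Find the densest local cluster of 2D gaze estimates.

    For each point, count neighbours within `radius`; the point with the
    most neighbours seeds the cluster. A crude density mode, standing in
    for the clustering step described in the interview.
    """
    best_members = []
    for p in points:
        members = [q for q in points if math.dist(p, q) <= radius]
        if len(members) > len(best_members):
            best_members = members
    return best_members

def make_detector(cluster_members, margin=1.5):
    """Turn the densest cluster into a binary eye-contact detector.

    Stand-in for the on-the-fly model retraining: a gaze estimate counts
    as "eye contact" if it falls within `margin` times the cluster's
    spread around the cluster centroid.
    """
    n = len(cluster_members)
    centroid = (sum(p[0] for p in cluster_members) / n,
                sum(p[1] for p in cluster_members) / n)
    spread = max(math.dist(centroid, p) for p in cluster_members)
    threshold = margin * spread
    return lambda gaze: math.dist(centroid, gaze) <= threshold
```

Usage on synthetic data: feed in a stream of gaze estimates, most of which happen to land near the target, and the returned detector answers the binary "looking at target X?" question for new estimates.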

Tech Briefs: Why is it important to detect eye contact?

Dr. Andreas Bulling: Eye contact is a fundamental social signal and indicates what we are interested in and visually attend to. Given that we live in an “age of interruptions" and an “attention economy,” I would argue that attention has emerged as one of, if not the most important, kinds of information we can obtain from people.

Tech Briefs: What kinds of applications are possible now that eye contact can be detected?

Dr. Andreas Bulling: Eye contact can be detected on advertisements in public spaces at scale. Also, eye-contact detection during human-robot interaction will facilitate turn-taking and communicate mutual understanding and rapport.

Attentive user interfaces in automobiles could, for example, detect whether the driver has seen a notification or warning on the dashboard. If a driver is looking away from the road or is distracted, driving assistance could be activated automatically.

The system also has applications in autism research and treatment, as a tool to automatically assess whether, and how often, autistic patients engage in eye contact, and whether this behavior improves through training.

Tech Briefs: What is most exciting to you about this technology achievement?

Dr. Andreas Bulling: The technology allows, for the first time, the measurement of eye contact and attention in everyday situations, at scale, and without any user intervention. Users do not have to be equipped with an eye tracker or calibrate any system. And what's even better: the longer our method is deployed for a particular target object, the better the eye-contact detection gets.