Researchers from UCLA and the United States Army Research Laboratory have laid out a new approach for enhancing artificial intelligence-powered computer vision technologies by adding physics-based awareness.
Published in Nature Machine Intelligence, the study offers an overview of a hybrid methodology designed to improve how AI-based machinery senses, interacts with, and responds to its environment in real time, for example how autonomous vehicles move and maneuver, or how robots carry out precision actions.
With AI, visual machines can see and make sense of their surroundings by decoding data and inferring properties of the physical world from images. While such images are formed through the physics of light and mechanics, traditional computer vision techniques have predominantly focused on data-based machine learning to drive performance. Physics-based research has, on a separate track, been developed to explore the various physical principles behind many computer vision challenges.
It has been a challenge to incorporate an understanding of physics (the laws that govern mass, motion, and more) into the development of neural networks. These networks, modeled after the human brain with billions of nodes, can crunch massive image data sets until they gain an understanding of what the machines “see.” But there are now a few promising lines of research that seek to add elements of physics awareness into already robust data-driven networks.
“Visual machines — cars, robots, or health instruments that use images to perceive the world — are ultimately doing tasks in our physical world,” said the study’s corresponding author Achuta Kadambi, an assistant professor of electrical and computer engineering at the UCLA Samueli School of Engineering. “Physics-aware forms of inference can enable cars to drive more safely or surgical robots to be more precise.”
The research team outlined three ways in which physics and data are starting to be combined in AI for computer vision:
Incorporating physics into AI data sets
Tag objects with additional information, such as how fast they can move or how much they weigh, similar to characters in video games.
Incorporating physics into network architectures
Run data through a network filter that codes physical properties into what cameras pick up.
Incorporating physics into network loss functions
Leverage knowledge built on physics to help AI interpret training data on what it observes.
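To make the third idea concrete, here is a minimal Python sketch of a loss function that combines a data term with a physics term. It is an illustration under stated assumptions, not the method from the study: it assumes the network predicts the height of a falling object at evenly spaced time steps, that gravity is the only force acting on it, and that the physics term is weighted by a hand-picked factor. The names `data_loss`, `physics_loss`, `total_loss`, `free_fall`, and `weight` are hypothetical.

```python
# Hypothetical sketch: a physics-aware loss for predicted object heights.
# Assumption: the object is in free fall, so its acceleration should be -G.

G = 9.81  # gravitational acceleration in m/s^2 (assumed constant)


def data_loss(pred, obs):
    """Mean squared error between predicted and observed heights."""
    return sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(pred)


def physics_loss(pred, dt):
    """Mean squared residual of y'' = -G, using central finite differences."""
    residuals = [
        (pred[i - 1] - 2.0 * pred[i] + pred[i + 1]) / dt**2 + G
        for i in range(1, len(pred) - 1)
    ]
    return sum(r * r for r in residuals) / len(residuals)


def total_loss(pred, obs, dt, weight=0.1):
    """Data term plus a weighted penalty for violating the physics."""
    return data_loss(pred, obs) + weight * physics_loss(pred, dt)


def free_fall(y0, v0, dt, steps):
    """Ideal free-fall heights, used here as stand-in observations."""
    return [y0 + v0 * (i * dt) - 0.5 * G * (i * dt) ** 2 for i in range(steps)]


if __name__ == "__main__":
    dt = 0.1
    observed = free_fall(10.0, 0.0, dt, 8)
    # A straight-line prediction ignores acceleration entirely.
    straight_line = [10.0 - 0.5 * i for i in range(8)]
    print("physics loss, free-fall prediction:", physics_loss(observed, dt))
    print("physics loss, straight-line prediction:", physics_loss(straight_line, dt))
    print("total loss, straight-line prediction:", total_loss(straight_line, observed, dt))
```

In this sketch a prediction that follows the free-fall curve has a physics penalty close to zero, while a straight-line prediction is penalized even where it happens to sit near the observations. During training, that extra term would push the network toward physically plausible trajectories.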
These three lines of investigation have already yielded encouraging results in improved computer vision. For example, the hybrid approach allows AI to track and predict an object’s motion more precisely and can produce accurate high-resolution images from scenes obscured by inclement weather.
With continued progress in this dual-modality approach, deep learning-based visual machines may even begin to learn the laws of physics on their own, according to the researchers.