Self-driving vehicles will revolutionize the road, but the AI they use must be continually re-evaluated. (Image: zapp2photo/Adobe Stock)

I started thinking about an important problem with artificial intelligence (AI) when I came across a discussion of whether it was a good idea to use it for Advanced Driver Assistance Systems (ADAS). What struck me was that if something goes wrong while a car is in motion, it is very difficult to discover why the AI made the wrong decision, and therefore very difficult to figure out how to make sure it doesn’t happen again.

That is a fundamental problem with using AI for any critical application: we don’t know how it goes about making its decisions. As Professor Christian Lovis of the University of Geneva Faculty of Medicine put it, “The way these algorithms work is opaque, to say the least.”

Interpretability methods have been developed to discover which data an AI system has used and how they have been weighted. “Knowing what elements tipped the scales in favor of or against a solution in a specific situation, thus allowing some transparency, increases the trust that can be placed in them,” said Assistant Professor Gianmarco Mengaldo, Director of the MathEXLab at the National University of Singapore College of Design and Engineering. However, different AI interpretability methods often produce very different results, even when applied to the same dataset and task.
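To make that disagreement concrete, here is a minimal sketch in Python (the toy model and input are invented for illustration and have nothing to do with the study): two widely used attribution techniques, occlusion and gradient-times-input, point to different features as the most important one for the very same model and input.

```python
# Toy illustration: two common attribution methods disagree on the same model.
import numpy as np

def model(x):
    # Hypothetical scorer: the first feature saturates, the second stays linear.
    return np.tanh(3.0 * x[0]) + 0.2 * x[1]

x = np.array([2.0, 1.0])
base = model(x)

# Occlusion: zero out each feature and record how much the output changes.
occlusion = np.array([
    base - model(np.where(np.arange(x.size) == i, 0.0, x))
    for i in range(x.size)
])

# Gradient * input: a simple saliency variant, approximated with finite differences.
eps = 1e-5
grad = np.array([
    (model(x + eps * np.eye(x.size)[i]) - base) / eps
    for i in range(x.size)
])
saliency = grad * x

print("occlusion ranks feature", int(np.argmax(np.abs(occlusion))), "as most important")
print("saliency  ranks feature", int(np.argmax(np.abs(saliency))), "as most important")
```

The first feature dominates the occlusion score because removing it changes the output the most, while its saturated gradient makes it nearly invisible to the saliency score. Which answer is “right” depends on what you want the explanation to capture, which is exactly why the inconsistency matters.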

So, to address that problem, a team of researchers from the University of Geneva, the Geneva University Hospitals, and the National University of Singapore has developed a new interpretability method for deciphering why and how an AI decision was reached.

For example, when AI software analyzes images, it focuses on a few characteristics to enable it to, say, differentiate between an image of a dog and an image of a cat. “The same principle applies to analyzing time sequences: the machine needs to be able to select elements such as peaks that are more pronounced than others to base its reasoning on. With ECG signals, it means reconciling signals from the different electrodes to evaluate possible dissonances that would be a sign of a particular cardiac disease,” said Hugues Turbé, first author of the study.
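As a loose illustration of what “selecting the pronounced peaks” can look like in code (the two-lead signal, sampling rate, and prominence threshold below are all invented for the example, not taken from the study), a sketch might be:

```python
# Illustrative only: pick out pronounced peaks in a synthetic two-"electrode"
# signal and check whether the two leads agree on where the beats occur.
import numpy as np
from scipy.signal import find_peaks

t = np.linspace(0, 4, 2000)  # 4 seconds at roughly 500 Hz
beats = sum(np.exp(-((t - c) ** 2) / 0.0005) for c in (0.5, 1.5, 2.5, 3.5))
lead_1 = beats + 0.05 * np.random.default_rng(1).normal(size=t.size)
lead_2 = 0.7 * beats + 0.05 * np.random.default_rng(2).normal(size=t.size)

# Keep only peaks that stand out clearly from their surroundings.
peaks_1, _ = find_peaks(lead_1, prominence=0.5)
peaks_2, _ = find_peaks(lead_2, prominence=0.5)

# "Reconciling" the electrodes: do the leads agree on beat timing?
agreement = [abs(p1 - p2) < 25 for p1, p2 in zip(peaks_1, peaks_2)]
print("beats found per lead:", len(peaks_1), len(peaks_2))
print("leads agree on beat timing:", all(agreement))
```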

The researchers developed two new evaluation methods to help understand how the AI makes decisions: one for identifying the most relevant portions of a signal, and another for evaluating their relative importance with regard to the final prediction. They tested their methods on a dataset they developed to verify their reliability. (This dataset is available to the scientific community so that any new AI aimed at interpreting temporal sequences can be easily evaluated.)
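The article doesn’t spell out the team’s metrics, so the sketch below only illustrates the general idea under assumed details: score each segment of a time series by how much the prediction changes when that segment is masked, then confirm that masking the highest-ranked segments degrades the prediction far more than masking the lowest-ranked ones.

```python
# Rough sketch of relevance scoring and an importance check for a time series
# (toy classifier and synthetic signal; not the team's published metrics).
import numpy as np

def predict(signal):
    # Hypothetical classifier score: responds to energy in the middle third
    # of the recording (a stand-in for "the part the model actually uses").
    third = len(signal) // 3
    return float(np.mean(signal[third:2 * third] ** 2))

rng = np.random.default_rng(0)
signal = rng.normal(0.1, 0.05, 300)                       # background noise
signal[120:160] += np.sin(np.linspace(0, 6 * np.pi, 40))  # informative burst

segments = np.array_split(np.arange(signal.size), 10)
base = predict(signal)

# Step 1 -- identify relevant portions: relevance of each segment is the
# change in the prediction when that segment is masked out.
relevance = []
for seg in segments:
    masked = signal.copy()
    masked[seg] = 0.0
    relevance.append(base - predict(masked))

# Step 2 -- check relative importance: masking the segments ranked most
# relevant should degrade the prediction far more than masking the least
# relevant ones.
order = np.argsort(relevance)[::-1]
top_masked, low_masked = signal.copy(), signal.copy()
top_masked[np.concatenate([segments[i] for i in order[:2]])] = 0.0
low_masked[np.concatenate([segments[i] for i in order[-2:]])] = 0.0

print("baseline prediction:              ", round(base, 3))
print("top-2 relevant segments masked:   ", round(predict(top_masked), 3))
print("bottom-2 relevant segments masked:", round(predict(low_masked), 3))
```

On this toy example, masking the two top-ranked segments wipes out most of the prediction while masking the two lowest-ranked ones barely moves it, which is the kind of sanity check you would want any attribution over a temporal sequence to pass.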

NIST Tackles AI Risk

The National Institute of Standards and Technology (NIST) has released its Artificial Intelligence Risk Management Framework (AI RMF 1.0), a guidance document for voluntary use by organizations designing, developing, deploying, or using AI systems to help manage the many risks of AI technologies. The document addresses a crucial factor that distinguishes AI from traditional software: people.

“While there are myriad standards and best practices to help organizations mitigate the risks of traditional software or information-based systems, the risks posed by AI systems are in many ways unique. AI systems, for example, may be trained on data that can change over time, sometimes significantly and unexpectedly, affecting system functionality and trustworthiness in ways that are hard to understand. AI systems are inherently socio-technical in nature, meaning they are influenced by societal dynamics and human behavior. AI risks — and benefits — can emerge from the interplay of technical aspects combined with societal factors related to how a system is used, its interactions with other AI systems, who operates it, and the social context in which it is deployed” (AI RMF 1.0, page 1).

Dr. Apostol Vassilev, Research Team Supervisor in the Computer Security Division at NIST, expanded on this theme. For example, as Vassilev points out, machine learning systems (a subfield of AI) are trained on historical data. But those data “reflect historical biases at the time.”

“I’ve learned through the work of Nobel Prize-winning psychologist Daniel Kahneman and others in behavioral economics that humans are terrible at being consistent in reasoning and coming up with the best solutions. I’m fascinated by how we as people can be both very limited and so creative and capable of deep thinking,” said Vassilev.

Last Thoughts

AI can be incredibly useful for quickly sifting through extremely large amounts of data and extracting information by detecting significant patterns; computers are great at that. The catch is that AI is really a partnership between computers and fallible humans. People’s ideas about what is significant are always evolving; that, after all, is what science is: a continuously evolving process. So the work of groups like those I’ve mentioned here is vital for constantly monitoring and evaluating AI systems.