Imagine you decide to sell your home.
You contact a realtor and ask them to value your house. However, if the realtor doesn’t offer any insight into how they arrive at their estimate—for example, what factors about the home they considered and which factors they felt were most important—then you’re unlikely to have confidence in that estimate. You also won’t know how you might increase the value of your home.
TRAINING MACHINE LEARNING ALGORITHMS TO BE USEFUL
We have a similar situation with machine learning models. If I want to build a model that predicts the value of the homes in my neighborhood, I’d start by deciding which features of a home I think are relevant—the square footage, the number of bedrooms, and the number of bathrooms, for example. I would then find a list of recent home sales and build a dataset containing each sold home’s feature values along with its final sale price. I could then use this data to optimize or “train” a machine learning (ML) algorithm to predict a house’s value based on its features.
In the language of ML, the process of learning from such a training set is called supervised learning, and the trained algorithm is called a model. The accuracy of a supervised ML model depends heavily on its training set: the data must represent the entire problem space being modeled, and it must be both accurate and unbiased. Returning to our example, once I’m satisfied with my model’s accuracy, I can cheerfully drive around my neighborhood estimating the value of other people’s homes by feeding their features into the model.
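To make that workflow concrete, here is a minimal sketch in Python using scikit-learn. The sale records and prices are invented purely for illustration.

    # A minimal sketch of the supervised-learning workflow described above.
    # The feature values and sale prices below are made up for illustration.
    from sklearn.linear_model import LinearRegression

    # Each row: [square footage, bedrooms, bathrooms]
    X_train = [
        [1500, 3, 2],
        [2100, 4, 3],
        [1100, 2, 1],
        [1800, 3, 2],
    ]
    y_train = [310_000, 450_000, 215_000, 365_000]  # final sale prices

    # "Training" the model: fit the algorithm to the labeled examples
    model = LinearRegression().fit(X_train, y_train)

    # Estimate the value of a neighbor's 1,650 sq ft, 3-bed, 2-bath home
    print(model.predict([[1650, 3, 2]]))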
THE NEED FOR INSIGHT
When presented with a prediction or decision, there are numerous reasons why we want to understand how that outcome was derived. One of the most important is to have confidence in the outcome. Understanding how predictions and decisions are made can also provide actionable insights into the situation being modeled. For example, if I know that my realtor attributes more value to a new kitchen than a new driveway, I can target any future investment in my home to maximize its value.
INTERPRETABILITY AND EXPLAINABILITY
The situation is similar for ML models. I want to have insight into how they’ve arrived at their predictions and classifications. Not only will this build my confidence in a model’s outcomes, but it can help me improve the performance of the model, such as by identifying gaps in the input feature set or gaps and biases in the training data.
Additionally, if I am using ML to predict future problems, then insight into how a prediction was reached can help me identify the likely root cause, which I can then address before the predicted problem actually occurs.
APPROACHES TO EXPLAINABLE AI
In the world of AI, an ML model is said to be interpretable or explainable if a human can understand how the model arrives at its outcomes. This could relate to the model’s overall behavior, referred to as global interpretability, or a specific outcome, referred to as local interpretability or explainability.
There are many different types of supervised ML algorithms. Some, such as decision trees and linear regression models, are inherently interpretable: a human can read the learned rules or coefficients directly, sometimes with the help of simple analytical techniques.
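As a rough illustration, a small decision tree’s learned rules can simply be printed and read. This is a minimal sketch using scikit-learn; the numbers are invented.

    # A decision tree is inherently interpretable: its learned rules can be
    # printed and read directly. Data below is invented for illustration.
    from sklearn.tree import DecisionTreeRegressor, export_text

    X_train = [[1500, 3, 2], [2100, 4, 3], [1100, 2, 1], [1800, 3, 2]]
    y_train = [310_000, 450_000, 215_000, 365_000]

    tree = DecisionTreeRegressor(max_depth=2).fit(X_train, y_train)
    print(export_text(tree, feature_names=["sqft", "bedrooms", "bathrooms"]))
    # Prints human-readable rules along the lines of "sqft <= 1650 -> value: ..."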
However, it is impossible for a human to directly interpret highly complex “black box” ML models, such as deep learning neural networks, which can contain millions of operations and weights. One approach to this problem is to use a simple interpretable model to approximate the behavior of a more complex and accurate model. That way, we can have at least some insight into how the complex model works.
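One common version of this idea is a global surrogate model: train a simple, readable model to mimic the black box’s predictions, then inspect the simple model. A minimal sketch, assuming scikit-learn and synthetic data, with the model choices purely illustrative:

    # Sketch of a global surrogate: approximate a complex model with a simple,
    # readable one. Synthetic data; both model choices are illustrative.
    from sklearn.datasets import make_regression
    from sklearn.ensemble import GradientBoostingRegressor
    from sklearn.tree import DecisionTreeRegressor, export_text

    X, y = make_regression(n_samples=1000, n_features=5, random_state=0)

    black_box = GradientBoostingRegressor().fit(X, y)  # the complex, accurate model

    # Train the surrogate on the black box's predictions, not the real targets
    surrogate = DecisionTreeRegressor(max_depth=3).fit(X, black_box.predict(X))
    print(export_text(surrogate))  # a readable approximation of the black box's behavior

The surrogate is only an approximation, but its printed rules give a rough, global picture of what the black box has learned.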
In the world of image recognition, features actually emerge in the neural network through the training process. For example, certain parts of a neural network can be shown to activate in response to specific components of an input image—edges, different types of texture, or faces, for example.
It is possibly only a matter of time before large language models like ChatGPT and Bard provide explanations that justify their output. Whether we should trust those explanations any more than the output itself is an interesting question.
THE FLIP SIDE OF EXPLAINABILITY
AI explainability can also be used by an adversary. For example, if I understand how a credit scoring model works, I may be able to game the system by manipulating the answers I submit.
In the world of image recognition, researchers demonstrated that they could force a particular model to always classify images as a toaster by adding a specially crafted patch to the input image. A worrying application of this approach would be using explainability analysis of a security image recognition system to find ways to circumvent it.
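A grossly simplified sketch of that kind of attack, written with PyTorch, is below. The pretrained model, target class index, and patch size are assumptions for illustration; real attacks also randomize the patch’s position, scale, and rotation.

    # Grossly simplified sketch of an adversarial patch attack. The pretrained
    # model, target class index, and patch size are illustrative assumptions.
    import torch
    import torchvision.models as models

    model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).eval()
    target_class = 859  # assumed ImageNet index for "toaster"

    patch = torch.rand(3, 50, 50, requires_grad=True)   # the trainable patch
    optimizer = torch.optim.Adam([patch], lr=0.05)

    for _ in range(100):
        images = torch.rand(8, 3, 224, 224)              # stand-in images for the sketch
        patched = images.clone()
        patched[:, :, :50, :50] = patch.clamp(0, 1)      # paste the patch in a corner
        loss = -model(patched)[:, target_class].mean()   # push predictions toward the target
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()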
CONCLUSION
Although we hear about developments in the application of AI almost daily, we hear little about the need for explainability. However, as the adoption of AI accelerates, the need for explainable AI is becoming increasingly important—and not only for the reasons already discussed but also in the context of risk and compliance.
Fortunately, the data science community has been researching different approaches to explainability for many years. Shapley value analysis, for example, traces back to Lloyd Shapley’s game theory work in 1953, while the last decade has seen considerable progress on explaining deep learning models. Nevertheless, most current approaches to AI interpretability and explainability are far too complex for anyone but a data scientist to use and interpret.
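Computing Shapley-value explanations today, for instance, typically means reaching for a specialist library. A minimal sketch, assuming the open-source shap package, xgboost, and the California housing dataset:

    # Minimal sketch of Shapley-value explanations, assuming the open-source
    # "shap" package, xgboost, and the California housing dataset.
    import shap
    import xgboost
    from sklearn.datasets import fetch_california_housing

    X, y = fetch_california_housing(return_X_y=True, as_frame=True)
    model = xgboost.XGBRegressor().fit(X, y)

    explainer = shap.Explainer(model)        # picks an efficient tree explainer automatically
    shap_values = explainer(X.iloc[:100])

    # Each row shows how much every feature pushed that home's predicted value
    # above or below the average prediction.
    shap.plots.waterfall(shap_values[0])     # local explanation for a single prediction

Even this short example assumes comfort with Python, model training, and plot interpretation, which illustrates the accessibility gap.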
What we need are AI algorithms that can reliably explain themselves to their users. If we ignore the need for explainability in AI and ML, we do so at our peril, throwing ourselves on the mercy of algorithms we simply don’t understand.
Paul Barrett is CTO for NETSCOUT, overseeing development of network assurance and cybersecurity tech for the world’s largest organizations.