Better AI Transparency Using Decision Modeling

Many Effective AI Models Can’t Explain Their Outcome

Some AI models (machine learning predictors and analytics) have strong predictive capability but are notoriously poor at providing any justification or rationale for their output. We say that they are non-interpretable or opaque. This means that they can be trained to tell us whether, for example, someone is likely to default on a loan, but cannot explain in a given case why this is.

Let’s say we are using a neural network to determine if a client is likely to default on a loan. After we’ve submitted the client’s details (their features) to the neural network and obtained a prediction, we may want to know why this prediction was made (e.g., the client may reasonably ask why the loan was denied). We could examine the internal nodes of the network in search of a justification, but the collection of weights and neuron states that we would see doesn’t convey any meaningful business representation of the rationale for the result. This is because the meaning of each neuron state and the weights of the links that connect them are mathematical abstractions that don’t relate to tangible aspects of human decision-making.

The same is true of many other high-performance machine learning models. For example, the classifications produced by kernel support vector machines (kSVM), k-nearest neighbour and gradient boosting models may be very accurate, but none of these models can explain the reason for their results. They can show the decision boundary (the border between one result class and the next), but as this is an n-dimensional hyperplane it is extremely difficult to visualize or understand.  This lack of transparency makes it hard to justify using machine learning to make decisions that might affect people’s lives and for which a rationale is required.

Providing An Explanation

There are several ways this problem is currently being addressed:

  1. Train a model to provide a rationale. In addition to training an opaque model to accurately classify data, train another model to explain it.
  2. Use value perturbation. For a given set of inputs, adjust each feature of the input independently to see how much adjustment would be needed to get a different outcome. This tells you the significance of each feature in determining the output and gives insight into the reasons for it.
  3. Use an ensemble. Use a group of different machine learning models on each input, some of which are transparent.
  4. Explanation-based prediction. Use an AI model that bases its outcome on an explanation, so that building a rationale for its outputs is a key part of its prediction process.

Model Driven Rationale

In this approach, alongside the (opaque) machine learning model used to predict the outcome we train another (not necessarily of the same type) to provide a set of reasons for the outcome. Because of the need for high performance (in providing an accurate explanation), this second model is also opaque. This ensemble is represented using the business decision modelling standard DMN, on the right.

The left hand model explains the outcome generated by the right

DMN DRD Showing Two Model Ensemble: One Model Explains the Other

Both models are trained (in parallel) on sample data: the first labelled with the correct outcome and the second with the outcome and a supporting explanation. Then we apply both models in parallel: one provides the outcome and the other the explanation. This approach has some important drawbacks in practice:

  • It can be very labour intensive to train the explanation model because you cannot rely on historical data (e.g. a history of who defaulted on their loans) for the explanation labels. Frequently you must create these labels by hand.
  • It is difficult to be sure that your explanation training set has full coverage of all scenarios for which explanations may be required in future.
  • You have no guarantee that the outcome of the first model will always be consistent with that of the second. It’s possible, if you are near the decision boundary, that you might obtain a loan rejection outcome from the first model and a set of reasons for acceptance from the second. The likelihood of this can be reduced (by using an ensemble) but never eliminated.
  • The explanation outputs stand alone and no further explanation is available because the model that produced them is itself opaque.
  • The two models are still opaque, so there is still no general understanding of how they work outside the outcome of specific cases.
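The parallel training and application described above can be sketched in a few lines. This is a minimal illustration only: the 1-nearest-neighbour ‘models’, feature values and labels below are invented stand-ins, whereas in practice both models would be high-performance (opaque) learners trained on real data.

```python
# Minimal sketch of the two-model ensemble: one model is trained on outcome
# labels, the other on hand-crafted explanation labels, and both are applied
# in parallel to the same case. All values here are illustrative stand-ins.
def nearest_label(train_X, labels, x):
    # Return the label of the training point closest to x (a stand-in 'model').
    dists = [sum((a - b) ** 2 for a, b in zip(row, x)) for row in train_X]
    return labels[dists.index(min(dists))]

train_X = [(20000, 0.9), (80000, 0.1), (45000, 0.4)]  # (income, risk score)
outcomes = ["REJECT", "ACCEPT", "ACCEPT"]             # historical outcome labels
explanations = [                                      # hand-crafted explanation labels
    "low income and high risk",
    "high income and low risk",
    "adequate income and moderate risk",
]

case = (22000, 0.8)
outcome = nearest_label(train_X, outcomes, case)       # first model: the outcome
reason = nearest_label(train_X, explanations, case)    # second model: the explanation
```

Note that nothing ties the two predictions together: each model answers independently, which is exactly why the outcome and the explanation can drift apart near a decision boundary.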

 

Value Perturbation

This approach uses a system like LIME to perturb the features of inputs to an opaque AI model to see which ones make a difference in the outcome (i.e., cause it to cross a decision boundary). LIME will ‘explain’ an observation by perturbing the inputs for that observation a number of times, predicting the perturbed observations and fitting an explainable model to that new sample space. The idea is to fit a simple model in the neighbourhood of the observation to be explained. A DMN diagram depicting this approach is shown below.

LIME uses the opaque AI module to classify perturbed variants of the data to give an explanation for the outcome

DMN DRD Showing How LIME Produces Explanations for Opaque AI Modules

In our loan example, this technique might reveal that, although the loan was rejected, if the applicant’s house had been worth $50,000 more or their conviction for car theft was spent they would have been granted the loan. This technique can also tell us which features had very limited or no impact on the outcome.
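The perturb-and-refit idea can be sketched in a few lines of NumPy. The black-box scoring function, its weights and its threshold below are invented for illustration; a real application would perturb the inputs of the actual trained model and, as LIME does, fit the surrogate on an interpretable representation of the features.

```python
import numpy as np

# Illustrative black-box: a linear score with a hard threshold standing in
# for an opaque model (the weights and cutoff are invented for this sketch).
def opaque_model(X):
    score = X @ np.array([0.4, 0.5, 0.1])   # income, property value, employment
    return (score > 0.5).astype(float)      # 1 = approve, 0 = reject

def local_explanation(model, x, n_samples=5000, scale=0.3, seed=0):
    """Perturb x, classify the perturbations with the opaque model, and fit
    a proximity-weighted linear surrogate; its coefficients rank how strongly
    each feature pushes the outcome across the local decision boundary."""
    rng = np.random.default_rng(seed)
    X_pert = x + rng.normal(0.0, scale, size=(n_samples, x.size))
    y = model(X_pert)
    # Weight samples by closeness to the observation being explained.
    w = np.exp(-np.sum((X_pert - x) ** 2, axis=1) / (2 * scale ** 2))
    A = np.hstack([X_pert, np.ones((n_samples, 1))])   # add intercept column
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return coef[:-1]                                   # per-feature influence

x = np.array([0.5, 0.5, 0.5])          # an applicant near the decision boundary
influence = local_explanation(opaque_model, x)
```

For this toy model the fitted coefficients recover the relative importance of the features: the third (lowest-weighted) feature receives a much smaller influence score than the other two, which is the kind of per-case account LIME reports.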

This technique is powerful because it applies equally to any predictive machine learning model and does not require the training of another model. However, it has a few disadvantages:

  • Each case fed to the machine learning model must be replicated with perturbations to every feature (attribute) in order to test its significance, so the input data volume to the machine learning model (and therefore the cost of using it) rises enormously.
  • As above, the model is still opaque, so there is still no general understanding of how it works beyond the outcomes of specific cases.
  • As pointed out in the original paper on LIME, it relies on the decision boundary around the point of interest being a good fit with the transparent model. This may not always be the case, so, to combat this, a measure of the faithfulness of the fit is produced.

Using an Ensemble

The use of multiple machine learning models in collaborating groups (ensembles) has been common practice for many decades. The aim of bagging (one popular ensemble technique) is typically to increase the accuracy of the overall model by training many different sub-models on the same data so they each rely on different quirks of that data. This is rather like the ‘wisdom of crowds’ idea: one gets more accurate results if you ask many different people the same question because you accept their collective wisdom whilst ignoring individual idiosyncrasies. An ensemble of machine learning models is, collectively, less likely to overfit the training data. This technique is used to make random forests from decision trees. In use, the same data is applied to many models and they vote on the outcome.

This technique can be applied to solve transparency issues by combining a very accurate opaque model with a transparent model. Transparent models, such as decision trees generated by Quinlan’s C5.0, or rule sets created by algorithms like RIPPER (Repeated Incremental Pruning to Produce Error Reduction)[1], are typically less accurate than opaque alternatives but much easier to interpret. The comparatively poor performance of these transparent models (compared to the opaque ones) is not an issue because, over the entire dataset, the accuracy of an ensemble is usually higher than that of the best member providing the members are sufficiently diverse. A DMN model explaining this approach is shown below.

The approximate transparent model provides an explanation for the accurate opaque model.

DMN DRD Showing Ensemble of Opaque and Transparent AI Modules

However, the real advantage of this approach over the others is that, because the transparent models are decision trees, rules or linear models, they can be represented statically by a decision service. In other words, the decision tree or rules produced by the transparent analytic can be represented directly in DMN and made available to all stakeholders. This means that this approach not only provides an explanation for any outcome, but also an understanding (albeit an approximation) of how the decision is made in general, irrespective of any specific input data.

Using Ensembles in Practice

The real purpose of these transparent models is to produce an outcome and an explanation. Clearly the explanation is only useful if the outcome of the transparent and opaque models agree for the data provided. This checking is the job of the top level decision in the DRD shown. For each case there are two possibilities:

  • The outcomes agree, in which case the explanation is a plausible account of how the decision was reached (note that it is not necessarily ‘the’ explanation as it has been produced by another model).
  • The outcomes disagree, in which case the explanation is useless and the outcome is possibly marginal. In a small subset of these cases (when using techniques like soft voting) the transparent model’s outcome may be correct, making the explanation useful. Nevertheless the overall confidence in the outcome is reduced because of the split vote.
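This agreement check is simple to state in code. A minimal sketch follows, with invented stand-in models and thresholds for the loan example; the function names are illustrative, not part of any specific library.

```python
# Sketch of the top-level decision: report the transparent model's
# explanation only when the two models agree on the outcome.
def explain_with_ensemble(opaque_predict, transparent_predict, case):
    """Return (outcome, explanation); explanation is None on disagreement."""
    opaque_outcome = opaque_predict(case)
    transparent_outcome, explanation = transparent_predict(case)
    if opaque_outcome == transparent_outcome:
        # A plausible account of how the decision was reached.
        return opaque_outcome, explanation
    # Disagreement: keep the (more accurate) opaque model's outcome, but
    # flag that no explanation is available and confidence is reduced.
    return opaque_outcome, None

# Invented stand-ins for the two models (thresholds are illustrative).
def opaque(case):
    return "REJECT" if case["income"] < 30000 or case["risk"] > 0.7 else "ACCEPT"

def transparent(case):
    if case["income"] < 30000:
        return "REJECT", "income below the 30,000 threshold"
    return "ACCEPT", "income meets the 30,000 threshold"

agreed = explain_with_ensemble(opaque, transparent, {"income": 25000, "risk": 0.2})
split = explain_with_ensemble(opaque, transparent, {"income": 40000, "risk": 0.9})
```

The first case returns the outcome with its explanation; in the second the transparent rules (which ignore risk) disagree with the opaque model, so only the outcome is returned.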

The advantage of this approach is that we have a static representation of an opaque model in the DMN standard, which gives an indication of how the model works and can be used to explain its behaviour both in specific cases and in general.

If the precision of our opaque model is 99% and that of our transparent model is 90% (figures obtained from real examples) then the worst case probability for obtaining an accurate outcome and a plausible explanation is 89%. Having a parallel array of transparent decision models would increase this accuracy at the cost of making the whole model harder to understand. Individual explanations would retain their transparency.
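One way to arrive at that 89% worst case: the two models’ error sets may not overlap at all, in which case the probability that both are correct (an accurate outcome accompanied by a usable explanation) falls to its lower bound of p₁ + p₂ − 1.

```python
# Worst-case probability that the opaque model is correct AND the
# transparent model agrees with it: if their error sets are disjoint,
# P(both correct) = p_opaque + p_transparent - 1 (the Fréchet lower bound,
# clamped at zero).
p_opaque, p_transparent = 0.99, 0.90
worst_case = max(0.0, p_opaque + p_transparent - 1.0)  # 0.89
```

If the errors were instead independent, the figure would be 0.99 × 0.90 ≈ 89.1%, so the bound is only slightly pessimistic at these accuracy levels.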

Explanation Based Prediction

This is a relatively new approach typified by examples like contextual explanation networks (CEN)[2], in which explanations are created as part of the prediction process without overhead. This is a probabilistic, graph-based approach built on neural networks. As far as we are aware there are no generally-available implementations yet. The key advantages of this approach are:

  • The explanation is, by definition, an exact explanation of why a given case yielded the outcome it did. None of the other approaches can guarantee this.
  • The time and accuracy performance of the predictor is broadly consistent with other neural networks of this kind. There are no accuracy or run-time consequences of using this approach.

The disadvantage of this approach is that it only works with a specific implementation of neural networks (which is not yet generally supported) and only yields a case-by-case explanation, rather than a static overview of decision-making.

Conclusion

Decision Modelling is an invaluable way of improving the transparency of certain types of narrow AI modules, machine learning models and predictive analytics which are almost always non-transparent (opaque). A particularly powerful means of doing this for datasets with fewer than 50 features is to combine both opaque and transparent predictors in an ensemble. The ensemble can then provide both an explanation for the outcome of a specific case and a general overview of the decision making process.

The transparent model uses a decision tree or rules to represent the logic of classification. Using DMN to represent these rules provides a standard and powerful means of making the model transparent to many stakeholders. DMN can also be used to represent the ensemble itself, as shown in this article’s examples.

Dr Jan Purchase and David Petchey are presenting two examples of this approach at Decision CAMP 2018 in Luxembourg, 17-19 September 2018. Why not join us?

Acknowledgements

My thanks to David Petchey for his considerable contributions to this article. Thanks also to CCRI for the headline image.

[1] Repeated Incremental Pruning to Produce Error Reduction (RIPPER) was proposed by William W. Cohen as an optimized version of IREP. See William W. Cohen: Fast Effective Rule Induction. In: Twelfth International Conference on Machine Learning, 115-123, 1995.

[2] Contextual Explanation Networks. Maruan Al-Shedivat, Avinava Dubey, Eric P. Xing, Carnegie Mellon University, January 2018.

Improve Your Chances of AI Success with Decision Modeling

Increasing Use of Narrow AI in Business Automation

In this series of articles I explore how decision management and modeling can help increase the success of AI deployments.

As part of their digital transformation initiatives, companies are going beyond the established use of predictive analytics by embedding ‘narrow’ artificial intelligence models within their automated systems. The outcomes of these models directly control the system’s actions. Not to be confused with Strong AI, which attempts to approximate general human intelligence, this ‘Narrow’ (or ‘Weak’) AI uses machine learning to supplement (or even replace) human judgement in very specific areas. Typically these systems automate the acquisition of business insights from data, previously requiring human experience, and then act on these insights with or without human supervision.

AI models provide observations that can be used by companies to: segment and target higher value customers; personalize service and product offerings to better anticipate customer requirements and predict and avoid customer churn. By these means companies hope to acquire, satisfy and retain higher value customers. Other uses of narrow AI include: optimizing transport logistics by anticipating demand for products; detecting fraud in real time and automating market sentiment analysis to understand the public mood and anticipate market changes.

Contrary to the hype, many are discovering that this use of narrow AI and machine learning does not guarantee success and has its own drawbacks. Frequently, initial attempts to use AI models can be expensive and of surprisingly limited value. Highly trained and expensive personnel and sophisticated, high-performance hardware can balloon budgets while the promised business benefits remain elusive. Why is this?

Narrow AI Informs Business Decisions

Some projects lose sight of the fact that narrow AI models are used, first and foremost, to inform a business decision, either directly by predicting something a business can act on to make the decision more profitable or, indirectly, by providing insight into the hidden relationships in business data that might be exploited. This is often a commercial decision such as:

  • Should we target this customer for sales and services? What’s their risk/reward profile?
  • What kind of customer is it: to which products and pitches would they be eligible and most responsive?
  • What specific actions should we take to pre-emptively please customers and prevent churn?
  • How can we reduce costs and minimize delays by predicting demand for products and services?
  • Does the pattern of customer behaviour suggest fraud or some other danger (e.g., incipient insolvency)?

It may cover automation of decisions which previously required human guidance:

  • Are recent changes in blood chemistry indicative of a medical condition?
  • Given past behaviour is this employee likely to be unreliable?

When AI projects lose sight of this business decision they deliver less benefit as a result.

This is because an accurate definition of this business decision and its business context are valuable assets in selecting, training and using machine learning techniques. Decisions help to focus the application of each AI model, define its business value and guide its evolution.

How Decision Models Help

In this series of posts we’ll be looking specifically at how decision models help to:

  • Define a Business Context: to show how the use of AI models fits into the big picture—how they collaborate to generate business insight, what their requirements are and exactly how they impact company behaviour.
  • Improve the Transparency of AI models: to make even the most opaque AI models yield an explanation for their outcomes—essential to meet the increasing public and regulatory demand for transparent decision-making in all areas of business.
  • Define Business Goals: to focus AI projects and provide both a business case and a means to measure their success.
  • Help SMEs to Steer AI: providing the best fusion of machine learning and human expertise by depicting how existing expertise informs and constrains AI.

In this article, let’s look at the first of these…

How Decisions Clarify the Business Context of AI

Business decisions provide a defining context for the use of AI Models and this is important because…

Each AI Model Should Be Applied to a Well-Defined Task

Narrow AI models are best developed for a specific purpose: to be used at a specific time and under specified circumstances. They rely on training and test data that are well understood and of adequate quality for the task at hand. In many circumstances they collaborate with business rules and other analytic models to achieve a business outcome. They perform poorly if their purpose is vaguely specified or appropriate data is not provided. They also behave poorly if they are unfocused and try to address more than one issue simultaneously (the ‘jack of all trades’ problem).

Defining a decision model is an excellent way of expressing both the ‘big-picture’ and the specific details of the context in which AI models are used. A DMN decision requirements diagram (DRD) shows how AI models collaborate with other sources of business information and knowledge to make a specific contribution to a business decision. It drives out the data and knowledge requirements of models and establishes a clearly defined relationship between them and the business decisions that use them. Such a context defines the inputs and outputs of the AI models and how these align with the key business actors, processes (using process oriented metadata in the DRD) and rules.

Example Decision Model

Consider the example DMN decision requirements diagram (DRD) below. This diagram shows how two AI models collaborate to determine the details of a client mortgage offer. For clarity, we’ve added some colour to better illustrate the integration of narrow AI (note: this isn’t part of the DMN standard). Real DMN DRDs are, at this level, agnostic about the technology used to implement decisions.

 

AI models (green) contribute to a decision model

A Decision Model DRD Illustrating the Context for Two AI Models (in Green)

Be aware that there is much more to a DRD than just a box and line diagram (see below) and much more to a decision model than just a DRD. However, just this single diagram tells us a lot about the use of narrow AI in this example.

  • The yellow rectangles represent conventional decisions based on business rules or human decision making.
  • The green decisions (rectangles) in this example are AI models. The Determine Property Risks decision uses Bayesian inference to classify the key physical perils within the vicinity of the property and Determine Likely Property Value uses a neural network to estimate the value of the property given its survey details. This highlights the fact that each decision in a DRD can be underpinned by business rules, machine learning models or any other executable representation. The DRD is agnostic of this and just shows the dependencies between them all.
  • The yellow knowledge sources (with wavy bases) are sources of knowledge that inform or constrain their respective decisions. They represent guidelines, policies, regulations, legal mandates, etc.
  • The blue knowledge sources, for decisions using weak AI, represent training and test data sets used to build the model. This representation holds for batch and continuously trained machine learning models.
  • The rounded rectangles represent data sources.
  • The lines represent information (or knowledge) dependencies. In each case, the item at the ‘arrow’ or ‘circle’ end of the line is dependent on the information or knowledge made available by the item at the ‘plain’ end.

How This Helps

Notice how the DRD:

  • Allows us to be explicit about the combined use of AI and business rules in business decision-making, so that all stakeholders can see the extent to which AI is being used and how.
  • Clearly defines all the dependencies in our decision making, helping us to understand and manage the collaboration and handle change in individual parts as well as informing all stakeholders precisely how AI is contributing.
  • Captures all the data requirements of the AI models—both in training and in use, in addition to the rest of the decision-making elements. The process of completing a DRD drives out the information required to support models and its relationships with data used elsewhere in the same decision. It can also identify missing data early.
  • Allows us to document the collaboration of AI modules using a recognized international standard for decision modeling: DMN. This means we can share our models more easily with others and take advantage of the many DMN tools on the market.

The DRD is just the diagrammatic representation of the dependencies between the elements of decision-making. The DMN standard also defines many properties for each of the symbols on the DRD that provide a wealth of additional information (such as its business goal and key performance indicators). In addition, the decision logic diagram provides detail on the business behaviour of each decision-making element. In the case of a decision using narrow AI, this could be a reference to a specific machine learning model and a definition of its interface.

Conclusion

The process of decision modelling adds much needed rigour to the design of AI model deployments, forcing AI engineers to be explicit about their models’ business contribution, data requirements and use scenarios early.

Currently, the AI and machine learning community have no notation to express how models interact with input data, training data and each other to achieve business outcomes or how their use fits into a business process. DMN (and, by association, BPMN) is well placed to fulfil this role. DMN could be used, for example, to illustrate the architecture of AI model ensembles and to show, using conventional decision tables, how ensemble voting works.
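As a small illustration of that last point, the top-level logic of hard (majority) ensemble voting, which a DMN decision table could express declaratively, amounts to no more than:

```python
from collections import Counter

# Hard (majority) voting: the ensemble's outcome is the most common
# outcome among its member models' predictions.
def majority_vote(predictions):
    outcome, _ = Counter(predictions).most_common(1)[0]
    return outcome

result = majority_vote(["ACCEPT", "REJECT", "ACCEPT"])
```

Rendered as a decision table, each member model’s prediction would be an input column and the winning outcome the output—logic that is trivially auditable by any stakeholder.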

In our next article we consider how decision modelling can improve the transparency of notoriously inscrutable AI models like neural networks and support vector machines…