Article by Bahador Khaleghi

Breaking Down AI’s Trustability Challenges

Figure: Taxonomy of AI trustability challenges

14 November 2018

Trust is a key requirement for the mass adoption, and thus overall success, of artificial intelligence. Building trust isn’t a simple process, however. There are a number of building blocks for trustworthiness that cover every aspect of an AI system.

Other, more mature disciplines, such as medical sciences and engineering, have already tackled the issue of trust at a societal level. Their solution usually involves experts in each field establishing a set of best practices for trustability, which are then enforced by institutions founded specifically for that purpose.

A first step towards those best practices for AI is asking the right questions. We propose a taxonomy model that highlights several key areas of concern for AI trustability at every stage of an AI model: inputs, outputs, retraining, and the model design itself (see image at the top of this article).

Every AI system in a production environment has a model at its core that turns some input, such as raw data or pre-processed features, into some output, such as predictions or policies. In addition, there is typically a module that monitors the model’s performance over time and retrains it if necessary. Given these components, we can identify six challenges in AI trustability:


Robustness

AI systems need to be robust in their defences against so-called adversarial attacks, which use tailored inputs to trip up existing models and get them to produce mistaken outputs. The emerging field of AI security aims at identifying such vulnerabilities and developing effective defences. An AI system cannot be deemed trustworthy unless it is shown to be robust. Otherwise, its output can be manipulated by an attacker and is therefore unreliable.
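To make "tailored inputs" concrete, here is a minimal sketch in the style of the fast gradient sign method, a technique we are choosing for illustration and not one named in this article. It assumes a toy logistic-regression classifier. The weights, the input, and the `epsilon` budget are all hypothetical values picked so that a small, bounded change to each feature flips the prediction.

```python
import math

# Hypothetical weights for a toy logistic-regression classifier (illustration only).
WEIGHTS = [2.0, -1.5, 0.5]
BIAS = -0.2


def predict_proba(x):
    """Return the model's probability that input x belongs to class 1."""
    score = sum(w * xi for w, xi in zip(WEIGHTS, x)) + BIAS
    return 1.0 / (1.0 + math.exp(-score))


def sign(value):
    """Return -1, 0 or 1 depending on the sign of value."""
    return (value > 0) - (value < 0)


def adversarial_example(x, y_true, epsilon):
    """Shift each feature by at most epsilon in the direction that raises the loss."""
    p = predict_proba(x)
    # Gradient of the cross-entropy loss with respect to the input of a logistic model.
    gradient = [(p - y_true) * w for w in WEIGHTS]
    return [xi + epsilon * sign(g) for xi, g in zip(x, gradient)]


x = [0.6, 0.2, 0.4]  # hypothetical input whose true class is 1
x_adv = adversarial_example(x, y_true=1, epsilon=0.5)

print("original prediction is class 1:", predict_proba(x) > 0.5)
print("adversarial prediction is class 1:", predict_proba(x_adv) > 0.5)
```

In this toy setting the attacker needs the model's weights to compute the gradient. Real attacks and defences are considerably more involved, so treat this only as an illustration of the mechanism.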

Robustness also covers broader challenges, such as unintentional adversarial examples: edge cases not represented in the training data that could affect the model’s output. Ideally, a trustworthy AI model is capable of communicating its limits and, when provided with such edge cases, adjusting its output accordingly, for example by reporting a lower certainty in its prediction.

For more on adversarial attacks, such as convincing a facial recognition model that you’re model/actress Milla Jovovich, see this previous post on our blog by Rey Wiyatno.


Privacy

The input data used to develop an AI model might contain sensitive personal information, so preserving privacy is crucial. Without adequate guarantees in place, an AI system will not be able to gain the trust of its users. Recent high-profile consumer data leaks, such as the Equifax breach, have proven extremely costly for the corporations involved, in terms of both consumer trust and the bottom line. Such scandals are solid proof of the close connection between preserving data privacy and establishing trust.

It is worth mentioning that privacy can come into play for an AI model and its output as well. For example, it might be desirable to keep the underlying architecture and parameters of a model private to protect intellectual property, or to communicate a model’s output privately through encryption. It is in the protection of input data, however, that privacy is most crucial.


Reproducibility

Though it seems obvious, any trustworthy AI system must be consistent and reproduce the same results from the same inputs. More specifically, it should be easy to replicate an AI system’s performance given its specifications and the hyper-parameters set before the model is trained.

Although reproducibility might seem easy to achieve, it can be challenging in practice. The stochasticity of the model (re)training process, e.g. due to random seeds, is a common challenge. In addition, the specific choice of AI development environment and the intrinsic variance of an AI model itself can have an outsize impact on outcomes. An AI system with performance that cannot be consistently replicated will not qualify as trustworthy.
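The effect of random seeds can be sketched with a toy training loop. The example below is an assumption-laden illustration, not a recipe for any particular framework: the `train` function, its synthetic data, and its hyper-parameters are all hypothetical. It draws every random choice (initial weights and data order) from one seeded generator, so the same seed yields exactly the same model.

```python
import random


def train(seed, epochs=20, learning_rate=0.05):
    """Fit y = w * x + b with stochastic gradient descent on synthetic data."""
    rng = random.Random(seed)  # single source of randomness for this run
    # Hypothetical noiseless data generated from y = 3x + 1.
    data = [(i / 10.0, 3.0 * (i / 10.0) + 1.0) for i in range(10)]
    w = rng.uniform(-1.0, 1.0)  # random initialisation
    b = rng.uniform(-1.0, 1.0)
    for _ in range(epochs):
        rng.shuffle(data)  # random ordering of training examples
        for x, y in data:
            error = (w * x + b) - y
            w -= learning_rate * error * x
            b -= learning_rate * error
    return w, b


run_a = train(seed=42)
run_b = train(seed=42)
run_c = train(seed=7)

print("same seed, identical parameters:", run_a == run_b)
print("different seed, identical parameters:", run_a == run_c)
```

Fixing the seed addresses only one source of variation. Differences in library versions or hardware, which fall under the development environment mentioned above, can still change results.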


Fairness

Because AI models are trained on data that reflects how things are, an AI model can capture existing biases in human society. Examples of such biased AI models are fairly common, such as NLP systems that struggle with African-American dialects because they weren’t included in the training data, or sexist image recognition systems that used training image sets associating women with kitchens and men with sports.

A biased AI system will reproduce the problems inherent in its training data, so it’s important for researchers to recognize bias in both inputs and outputs. If not addressed properly, a biased AI system will simply amplify our prejudices and consequently exacerbate any issues that result from those biases. Building trust means understanding how an AI model was trained and how its outputs reflect that training.
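One simple way to look for bias in outputs is to compare how often a model produces a favourable prediction for each group. The sketch below assumes binary predictions and a single group label per record. The group names and predictions are made up for illustration, and the gap it reports is only one of many possible fairness measures.

```python
from collections import defaultdict


def positive_rate_by_group(groups, predictions):
    """Return the share of positive (1) predictions within each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for group, prediction in zip(groups, predictions):
        totals[group] += 1
        positives[group] += int(prediction == 1)
    return {group: positives[group] / totals[group] for group in totals}


# Hypothetical group labels and model outputs (illustration only).
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
predictions = [1, 1, 1, 0, 1, 0, 0, 0]

rates = positive_rate_by_group(groups, predictions)
gap = max(rates.values()) - min(rates.values())

print("positive rate per group:", rates)
print("largest gap between groups:", gap)
```

A large gap does not prove that a model is unfair on its own, but it flags where the training data and outputs deserve a closer look.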


Accountability

AI systems are commonly used as a way to augment human labor and automate certain tasks. As AI begins to take on more of a critical decision-making role, it’s important to establish accountability for the results. If an AI system is led astray by an invalid input, you need to know who to blame. An AI system deployed in a mission-critical environment must be developed with accountability in mind or it will not be trusted.

The question of accountability is a complicated subject, one that goes beyond the realm of AI system development and requires policy development with input from social and legal scholars. And to assign responsibility for a certain output, we need to understand how the model produced that output. That feeds into the next component of the AI trustability taxonomy, and one that is perhaps the most important in terms of developing trust: explainability.


Explainability

The big question for AI systems is why. Why does a model operate a certain way? Given a model’s output, what produced that output? What associations has the model learned, and why did it learn them? These questions are self-evident for any predictive system, whether it’s a human or a machine. Yet the uncertainty at the heart of modern AI methods and the vast datasets involved mean that sometimes these questions are challenging or nearly impossible to answer. To build trustworthy AI, researchers have to be able to explain the why of a model’s results.

There is a growing literature on AI explainability, and it gets at the question of why and what it means to trust AI. A so-called black box AI system, one that is unable to provide some level of explanation for its output, will be challenging, if not impossible, to adopt in an organization. This common-sense view is backed up by research studies showing that explanations are key to helping users trust automated systems.
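As one illustration of what "some level of explanation" can look like, the sketch below applies permutation importance, a model-agnostic technique we are choosing for illustration and not one this article prescribes. It shuffles one input feature at a time and measures how much the model's error grows. The `black_box_model` function and the data are hypothetical, and the model deliberately ignores its second feature so the result is easy to check.

```python
import random


def black_box_model(row):
    """Hypothetical opaque model: depends on feature 0 and ignores feature 1."""
    return 3.0 * row[0] + 0.0 * row[1]


def mean_squared_error(model, rows, targets):
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(model, rows, targets, feature, rng):
    """Return the increase in error after shuffling one feature column."""
    baseline = mean_squared_error(model, rows, targets)
    column = [row[feature] for row in rows]
    rng.shuffle(column)
    permuted = [
        row[:feature] + [value] + row[feature + 1:]
        for row, value in zip(rows, column)
    ]
    return mean_squared_error(model, permuted, targets) - baseline


rng = random.Random(0)  # seeded so the sketch is repeatable
rows = [[rng.random(), rng.random()] for _ in range(50)]
targets = [black_box_model(row) for row in rows]

importances = [
    permutation_importance(black_box_model, rows, targets, feature, rng)
    for feature in range(2)
]
print("importance of feature 0:", importances[0])
print("importance of feature 1:", importances[1])
```

Shuffling the ignored feature leaves the error unchanged, while shuffling the feature the model relies on increases it. This tells us which inputs matter to the model, though not why it learned to use them.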

Explainability is even more important for trustability when such systems are deployed in regulatory agencies or mission-critical settings that are mandated by law to provide clear justifications for their operations. In the European Union, the new GDPR privacy rules include provisions that are widely interpreted as a “right to explanation” for algorithmic decision-making.

Explainability is also in some ways a prerequisite for addressing other trust-related challenges such as bias and accountability. In the AI trustability taxonomy, explainability is the apex. At Element AI, we have a team dedicated to explainability in AI, and the importance of explainability as a serious research area will only grow in the coming years.

What’s next

For AI to be trustworthy, and to have the trustability necessary for broad adoption, it needs to fulfill all aspects of the AI trustability taxonomy presented here: robustness, privacy, reproducibility, fairness, accountability, and explainability. This is one part of a series that will examine each of these challenges in more detail; see our previous post on adversarial attacks here.

If you are an aspiring data scientist, designer, data engineer, AI expert and/or hacker and find the notion of AI trustability appealing, then check out our careers page here. We are hiring!

Special thanks to Jason Stanley, Archy de Berker, Wei-Wei Lin, Philippe Beaudoin, Xavier Snelgrove, Elnaz Barshan, and Satsuko VanAntwerp for valuable comments and illustrations. Edited by Peter Henderson.