


AI Trustworthiness: Defining the Main Principles for ensuring AI safety

September 2017 by Eric Stefanello, President of Difenso

In a world where the rapid development of AI, especially since the dawn of deep learning, has raised fears and misunderstandings, the purpose of this article is to try to set out the initial grounds for defining what AI trustworthiness could and should be in the future. In this article, we define what deep AI is and how its behavior is specific in terms of determinism. Traditional methods to ensure IT trustworthiness are considered as well as how they may or may not fit with AI trustworthiness issues. AI cybersecurity issues are also examined.

Subject terms: Computer science – Software trustworthiness

Artificial Intelligence (AI) trustworthiness
What is AI? Connected and non-connected AI (definition)
Simple AI and deep AI: definitions (connection to data, continuous learning, self-programming AI, multidimensional neural networks)
Deterministic and non-deterministic behavior of AI
Legal consequences
Specific problems raised by AI connected to physical systems without a human in the loop
Main patterns used to secure IT programs in industry
Applicability of such patterns to AI connected to physical systems
AI Cybersecurity
New Patterns for securing deep AI

1.1 What we call AI in this article refers solely to neural-network-based AI. When speaking of other types of AI (expert systems, graph path algorithms, etc.) we will specify their special nature. We believe that, so far, only neural-network AI raises real concerns when it comes to safety. The AI systems in this article may be based solely on software, solely on hardware or on a mix of software and hardware.
AI may or may not be connected to physical systems. An AI system is considered non-connected to a physical system when it delivers its results, whatever they are, exclusively as information displayed to humans or to other IT systems which themselves are not connected to physical systems. An AI system is considered connected to a physical system when the results it delivers are directly used to generate tangible actions in the physical world. Note that if a person is involved in the decision loop (meaning that this person has to perform another specific and independent action to allow the AI proposal to be converted into a physical action), the AI is considered non-connected. If the person in the decision loop only supervises the AI outputs, with the AI remaining autonomous to perform the action, we then consider such AI as connected to the physical system.

1.2 Simple AI systems are those performing a single set of functions of the same kind. An AI system performing image recognition is considered simple AI, whatever deep learning methods are used to build it and whatever set of input data and output results is used in the learning phase. Deep AI systems are those with one or several of the following characteristics:
- They include various simple AI systems (called sub AIs) and are thus able to perform a varied set of functions. The ultimate decision-making system above those various sub AIs is also an AI system, which can interact with and trigger parameters in any layer of each sub AI.
- They can connect themselves to a global set of new input data (net wise) and get feed-back on their results without any human intervention (unsupervised deep learning). As a consequence, they can improve their learning autonomously.
- They can replicate themselves autonomously on various hardware or on virtual machines
- They can analyze and interpret natural language: understand many types of questions, synthesize the meaning of many types of text
- They continuously improve their learning without human intervention
- They are naturally connected to the web and can perform searches autonomously
- They can create new patterns of neural networks to improve their learning and autonomously check whether the outputs they get from them fit with the knowledge they get from the results of other sensors or sub AIs
It is obvious that these various characteristics can depict very different types of AI and that what we call here “deep AI” is in fact a whole family of AI systems with very different performances. At this stage let us consider them on a continuum from the simplest deep AI system (able to connect to new data) to the most complex, which would include all of the previous features. These concepts of deep AI are very close to the concept of Superintelligence developed by Nick Bostrom (1) and could easily lead to all the threats to mankind he points out. Nevertheless, in this paper we stay down to earth, focused on an IT program safety approach.

1.3 Traditional software programs, be they process-oriented or object-oriented, are always deterministic from a process point of view. Even if you use stochastic methods which give results based on probabilities (i.e. non-deterministic result-wise), the methods and the process used are in fact deterministic (deterministic process-wise). That means that: (i) you can record their behavior and precisely follow the process they go through to transform a set of inputs (data or human interactions) into a set of outputs; (ii) if you replay the process with the same set of inputs (and with the same timing, if we also consider real-time systems), the program will deliver the same set of outputs AND the two behavior records (meaning the various calculation paths it followed) will be the same.
In other words, the behavior of a traditional program is fully deterministic. Let us now analyze the behavior of very simple AI. An image recognition AI given a set of input images will always produce the same results, classifying the trains, the buildings and the people into different lists. This is true even if this AI uses stochastic methods.
In other words, using stochastic methods does not imply non-determinism in the real world, at least in the sense in which the terms determinism/indeterminism are used in physics (5).
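The two replay properties (i) and (ii) can be illustrated with a minimal sketch. It assumes that the random numbers come from a seeded pseudo-random generator, so that the seed counts among the inputs; the function name monte_carlo_pi and the trace format are hypothetical and chosen only for illustration.

```python
import random


def monte_carlo_pi(seed, samples, trace):
    """Estimate pi by random sampling, recording every step in `trace`."""
    rng = random.Random(seed)  # assumption: the seed is part of the inputs
    inside = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        hit = x * x + y * y <= 1.0
        trace.append((x, y, hit))  # behavior record: one entry per calculation step
        inside += hit
    return 4.0 * inside / samples


# Run the same stochastic method twice with the same inputs.
trace_a, trace_b = [], []
result_a = monte_carlo_pi(seed=42, samples=1000, trace=trace_a)
result_b = monte_carlo_pi(seed=42, samples=1000, trace=trace_b)

# (ii) same outputs AND same behavior records on replay.
replay_is_identical = (result_a == result_b) and (trace_a == trace_b)
print(replay_is_identical)
```

The method is stochastic in the sense that its result is a probabilistic estimate, yet the recorded calculation path is reproduced exactly on replay.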

A first question arises if you consider the process this AI is following. If you perform the same recognition process one week later and if during this week, you have improved the AI’s image recognition ability, the same set of inputs will give the same lists of results, BUT the process it follows will probably be different. One could argue that this new learning has changed the AI system, transforming it into a new version of the program. This cannot be defined as non-deterministic behavior as such (as the AI is no longer “the same”). This argument would be acceptable if you consider that you can accurately record the various processes the AI goes through to produce the results. The problem is that it is in fact very difficult, if not impossible, to record the process followed by an AI system, as it doesn’t “run processes” but instead triggers massively parallel sets of artificial neurons in real time.

Let us then simplify things by considering that such a simple AI system is deterministic in terms of its results (the same inputs will trigger the same results) but non-measurable in terms of its process (you don’t really know exactly how it produces the results). Determinism is indeed the key issue that needs to be considered when it comes to ensuring AI safety. Let us go further by considering recent improvements made to AI systems. AlphaGo (2), Google’s Go-playing AI, combines a tree search procedure with three convolutional networks that guide it (two policy networks and one value network).

Deep reinforcement learning (in which the policy networks were trained by playing against each other to improve their performance) has led to the creation of Go strategy patterns that had never been seen in Go history. The progress made by AlphaGo during the year after the first match with Lee Sedol (already won by the machine) is huge. If there is a first kind of singularity in AlphaGo history, this is it. The important thing is not that the machine beat the human. The important thing is that it did so in a way the human would never have expected: AlphaGo’s performance left Ke Jie (the world’s best Go player, who played against AlphaGo and lost) “shocked” and “deeply impressed” in his post-match statements of May 27, 2017, noting that the moves the computer played “would never happen in a human to human match”.
AlphaGo could then be considered as one of the first deep AI systems (self-reinforcement learning of two different sub AIs – the two policy networks) created by humans. Its behavior is already highly non-deterministic, as how it performs (Go strategy) is unknown to humans.
Deep AI systems, as described above, will all be highly non-deterministic. Let us take this as a first conclusion from which we will analyze the various consequences.

1.4 This inherent non-determinism in deep AI (process-wise) is a good point at which to step briefly away from science and look at the legal issues. We should start from how humans already cope with non-determinism. From weather forecasts to the effects of drugs on human bodies, mankind has tried to master the risks of non-deterministic events. Against the bad or catastrophic effects of weather, we choose insurance: the risks cannot be eradicated, so we try to control their consequences. If we consider medicine, we know there is a degree of non-determinism in the way each human body reacts to a given medicine. We use statistics and probabilities to master such risks: medical trials, from Petri dishes to animals to patients, make it possible to measure the risks of undesirable effects and to decide whether they are bearable. Continuous monitoring of side effects in populations using the same drug then improves our knowledge and helps us to control the risks. Nevertheless, from a legal point of view, weather forecasting and the pharmaceutical industry are vastly different. If a problem occurs with a medicine, you have a party to turn against and sue. With weather, you don’t, and only states can create specific programs to support weather victims who are not insured.
AI is in fact more like weather than medicine. AI programmers (if this word still has a meaning), or at least the company which delivers an AI system or service to a customer, may be thought of as entirely responsible for the AI’s behavior. This is true for simple AI systems but absolutely not for deep AI. Indeed, as soon as a deep AI system has been delivered, it will be affected by the interactions it has with you and by all the data it encounters. Very soon, it will in fact be a “different” program (if we can call AI a “program”) from the one delivered initially. So, it is not easy to determine which party to turn against. Moreover, the inherently non-deterministic behavior of deep AI will mean that nobody is really responsible for its behavior. Is a storm legally responsible? No, it has no legal personality. From a legal point of view, we can hardly consider deep AI as having a legal personality. It is more like a wild animal whose behavior you can never really be sure of or trust. If you assign a legal personality to deep AI, that means you would consider it as somehow “conscious” like humans (or even beyond). Artificial consciousness is another field of consideration, totally outside the scope of this article and, still today, outside the scope of science.

1.5 We know and use daily IT programs connected to physical systems without any humans in the loop. Automatic subways or modern aircraft run such programs completely safely. Indeed, however complex they may be, they are and remain whatever happens, fully deterministic.
Considering AI or deep AI connected to a physical system without a human in the loop raises many new questions. If it is a very simple AI system, for instance an image recognition system analyzing your face to decide whether or not to open a door, it is not a major problem, as the associated risks are low. But if the same AI is supposed to spray nerve gas onto unauthorized visitors, you might start to have doubts. In fact, you should not: if the AI system has been proved to have fully deterministic behavior, it is then possible from an IT safety point of view (see below) to predict and prove a given level of safety for this AI; for instance, 1 chance in 10^12 for a problem to occur.
Aircraft and nuclear arms controls rely on IT whose behavior can be measured and monitored. A new frontier has been reached when you want to use some non-deterministic IT such as deep AI. Connecting deep AI to physical systems will trigger new problems and a specific approach to AI safety is now necessary.
A simple AI-based robot mimicking a human walking is connected to a physical system (the environment in which it moves) and should therefore be covered by specific security and safety measures to ensure that its behavior will not adversely affect any surrounding humans.
Self-driving cars are and will be IT systems connected to physical systems (connected to us, to start with). They can be fully deterministic IT systems, and hence traditional methods (see below) will be used to ensure safety. But if they are based on, or embed, deep AI, very specific new methods will have to be developed and embedded to ensure safety.

1.6 Main patterns used to secure IT programs in industry
Securing IT programs embedded in systems connected with the physical world is something that has been well understood for years. It relies on a full set of specification, programming and testing methodologies. In terms of programming, we can cite the following.
When it comes to very high safety (and security) requirements, a mix of specific secured hardware and secured software is mandatory. This is the case in nuclear armament systems, be it for systems control or for targeting. Ensuring the trustworthiness of aircraft real-time flight control systems depends on 3 to 5 different hardware platforms each running 5 different software programs (meaning developed and tested independently by different teams). From the inputs provided by various aircraft sensors, they each make calculations and deliver 5 different sets of results. A final voting system (itself often duplicated) compares the various results and eliminates 1 or 2 sets if they are considered too different from the other three. You then get a highly robust system which can still deliver reliable results, even if 2 or 3 of the calculators have deep failures. Program safety methods during development create safety-by-design, which the voting system reinforces. An ultimate safety net is built on the obligation for the parameters to remain within a predetermined flight envelope. Every vector outside this flight envelope will be rejected and ultimately replaced by the nearest vector inside the flight envelope. The sizing of the flight envelope for commercial aircraft of course also responds to the safety-by-design requirement and leads to an envelope that is restricted compared with what the aircraft is inherently capable of doing.
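The voting step and the envelope safety net can be sketched as follows. This is only an illustration under simplifying assumptions: each calculator delivers a single number, outliers are detected by distance to the median, and the envelope is an axis-aligned box (real flight envelopes are not). The names vote and clamp_to_envelope, the tolerance and all numeric values are hypothetical.

```python
from statistics import median_low


def vote(results, tolerance):
    """Discard results too far from the median and average the remaining ones.

    Returns (agreed_value, number_of_rejected_results).
    """
    center = median_low(results)  # always one of the actual results
    kept = [r for r in results if abs(r - center) <= tolerance]
    return sum(kept) / len(kept), len(results) - len(kept)


def clamp_to_envelope(vector, envelope):
    """Replace a vector outside the box envelope by the nearest vector inside it."""
    return tuple(min(max(v, lo), hi) for v, (lo, hi) in zip(vector, envelope))


# Five independently developed calculators; one of them has failed.
pitch_results = [2.0, 2.1, 1.9, 9.5, 2.0]
pitch, rejected = vote(pitch_results, tolerance=0.5)

# Hypothetical envelope: (pitch range, bank range) in degrees.
envelope = ((-15.0, 25.0), (-30.0, 30.0))
command = clamp_to_envelope((pitch, 35.0), envelope)
print(pitch, rejected, command)
```

Both functions are pure and deterministic: the same five results always produce the same vote, and the same vector always maps to the same point of the envelope.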
Another key method used in industry to ensure IT trustworthiness is continuous software behavior monitoring. In such systems, one or two independent programs monitor the main one. Here, monitoring means making sure that the various branches of the program are traversed at the right time, that key steps are reached properly, and that the final results remain within a range of values compatible with what is expected at this stage of the global behavior of the whole system.
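A minimal sketch of such a monitor is given below. It assumes that the main program emits a list of (step name, timestamp) events, which is a hypothetical interface; the step names, the maximum delay and the value range are likewise illustrative.

```python
def check_run(events, expected_steps, max_gap, result, valid_range):
    """Independent monitor: verify the steps, their timing and the final result.

    events: list of (step_name, timestamp) pairs emitted by the main program.
    Returns "ok" or a short description of the first violation found.
    """
    names = [name for name, _ in events]
    if names != expected_steps:
        return "wrong or missing step"
    for (_, t0), (_, t1) in zip(events, events[1:]):
        if t1 - t0 > max_gap:
            return "step reached too late"
    low, high = valid_range
    if not low <= result <= high:
        return "result outside expected range"
    return "ok"


EXPECTED = ["read_sensors", "compute", "write_actuators"]  # hypothetical steps

good_run = [("read_sensors", 0.00), ("compute", 0.01), ("write_actuators", 0.02)]
late_run = [("read_sensors", 0.00), ("compute", 0.30), ("write_actuators", 0.31)]

print(check_run(good_run, EXPECTED, max_gap=0.05, result=2.0, valid_range=(-15.0, 25.0)))
print(check_run(late_run, EXPECTED, max_gap=0.05, result=2.0, valid_range=(-15.0, 25.0)))
```

Such a check is only possible because the monitored program has a known sequence of steps; this is the property that becomes hard to rely on for deep AI.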
In any case, we remain in a fully deterministic world, be it in terms of hardware or in terms of software. Less deterministic situations may occur when calculations derive from highly chaotic partial differential equations, but, then again, voting systems, coupled with final crosschecks of the value range (keeping the output vector within a predetermined functioning envelope), keep things on the safe side. During his career, the author has never encountered a situation of intrinsically non-deterministic IT behavior. Had he done so, he would never have allowed such a system to be deployed if connected to the physical world, and even less have it certified.

1.7 Applicability of such securing patterns to AI systems connected to physical systems: Main principles of securing deep AI
We have seen that deep AI has intrinsically non-deterministic behavior. How can we secure such systems and ensure that their behavior will not harm surrounding humans when they are connected to the physical world? A first, down-to-earth approach would be to consider that systems with non-deterministic behavior must always be monitored, and if necessary overridden, by deterministic systems. We could set this as the first rule of AI trustworthiness. It will be workable for all AI systems from which we expect known and more or less predictable behavior. Predictability is indeed the key condition if we want to deploy monitoring systems that check whether the proposed set of outputs is within an expected range of behavior (given the current operating conditions of the global system): the operational envelope (the term watchdog is not specific enough). Our analogy with flight control systems and the flight envelope is very helpful here. We will consider that the AI has to remain within its “behavior envelope” (functioning envelope).
This method will hardly be applicable when it comes to AI systems whose behaviors are not really predictable or from which we would be tempted to accept “creative” or “previously unthought-of” behaviors. Note that this is already the case for AlphaGo: it has deeply creative Go strategy behaviors. Happily, AlphaGo is not connected to a physical system that poses potential threats to humans: it only places stones on a Go board.
The only way to avoid unexpected or harmful behavior will be to require the connected physical system to be monitored, and if necessary overridden, by another, deterministic system obeying a set of meta rules protecting humans. This is rather similar to Asimov’s well-known laws of robotics, but the rules will need to be embedded within a non-AI system, “above” the AI system (in terms of monitoring, decision-making and control). Above all, they will not be generic and will have to be adapted to each and every case we have to secure. This is probably the main difference from what Asimov imagined: it cannot be the deep AI which monitors itself. Only a deterministic system could properly control Asimov’s robots. Otherwise we open the door to the unknown, and the unknown is hardly acceptable when it comes to human safety.
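The principle of a deterministic system placed “above” the AI can be sketched as follows. Everything here is a hypothetical illustration: the AI is replaced by a random stand-in, and the envelope values, the fallback command and the single meta rule (stop when a human is nearby) are assumptions made for one imaginary case, in line with the remark that such rules cannot be generic.

```python
import math
import random

# Hypothetical behavior envelope and safe fallback for one specific use case.
BEHAVIOR_ENVELOPE = {"speed": (0.0, 30.0), "steering": (-35.0, 35.0)}
SAFE_FALLBACK = {"speed": 0.0, "steering": 0.0}


def opaque_ai_proposal(rng):
    """Stand-in for a deep AI: the supervisor assumes nothing about its output."""
    return {"speed": rng.uniform(-5.0, 60.0), "steering": rng.uniform(-90.0, 90.0)}


def supervise(proposal, human_nearby):
    """Deterministic, non-learning layer between the AI and the physical system."""
    # Meta rule (case-specific): never move while a human is nearby.
    if human_nearby:
        return dict(SAFE_FALLBACK)
    # Reject malformed proposals instead of trying to interpret them.
    if set(proposal) != set(BEHAVIOR_ENVELOPE):
        return dict(SAFE_FALLBACK)
    if not all(math.isfinite(v) for v in proposal.values()):
        return dict(SAFE_FALLBACK)
    # Force the proposal back into the behavior envelope.
    return {
        key: min(max(proposal[key], low), high)
        for key, (low, high) in BEHAVIOR_ENVELOPE.items()
    }


rng = random.Random(0)
command = supervise(opaque_ai_proposal(rng), human_nearby=False)
print(command)
```

Whatever the AI proposes, the command that reaches the physical system is produced by a few lines of deterministic code that can be tested and certified with traditional methods.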

1.8 Cybersecurity of AIs
If it has not already done so, cyber-crime will one day get its claws into AI. This subject deserves a specific article, considering that it is a brand-new field of research today. Like all IT systems, AI systems are and will be exposed to unwanted intrusion, alteration or modification. It will always be possible to add some new neurons whose behavior has been specifically crafted to ensure specific outputs.
In addition, deep AI raises new cybersecurity issues. Indeed, as it might continuously improve its behavior by training with new input data, the concept of a program signature to ensure the non-alteration of the code is no longer workable. What could then be the concept of non-alteration of AI? Research will probably have to delve into concepts of “neural architecture signature” to ensure that the inner structure of the various neural networks in an AI system has not been altered, independently of the value of the parameters of the synapses. Influencing AI behavior by flooding it with corrupted data will also be a new vector of attack. Someday, we could imagine that specific AI systems will be created only for the purpose of attacking other AI systems, continuously searching for ways to induce improper behavior in the targeted AI.
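One possible reading of a “neural architecture signature” is a hash computed over the structure of the network while ignoring the synapse parameters. The sketch below is only that, a possible reading: the layer description format (a list of dictionaries with a "weights" entry) and the function name architecture_signature are hypothetical.

```python
import hashlib
import json


def architecture_signature(network):
    """Hash the structure of a network, ignoring the values of its weights.

    network: list of layer descriptions (dicts); the "weights" entry is skipped.
    """
    structure = [
        {key: value for key, value in layer.items() if key != "weights"}
        for layer in network
    ]
    canonical = json.dumps(structure, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


delivered = [
    {"type": "dense", "inputs": 3, "outputs": 4, "weights": [0.1] * 12},
    {"type": "dense", "inputs": 4, "outputs": 1, "weights": [0.2] * 4},
]
# Same structure after further learning: only the weights have changed.
retrained = [
    {"type": "dense", "inputs": 3, "outputs": 4, "weights": [0.7] * 12},
    {"type": "dense", "inputs": 4, "outputs": 1, "weights": [0.9] * 4},
]
# One neuron has been added to the first layer.
tampered = [
    {"type": "dense", "inputs": 3, "outputs": 5, "weights": [0.1] * 15},
    {"type": "dense", "inputs": 5, "outputs": 1, "weights": [0.2] * 5},
]

print(architecture_signature(delivered) == architecture_signature(retrained))
print(architecture_signature(delivered) == architecture_signature(tampered))
```

Such a signature would detect added neurons or layers but, by construction, says nothing about weights altered by corrupted training data, which remains a separate problem.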

1.9 New Patterns for securing deep AI systems
Beyond that, what could be totally new patterns and approaches for securing deep AI systems in the future? Here again, we could imagine specific AI systems whose sole purpose would be to secure other AI systems. If they are built to have fully deterministic behavior, this would probably be a good way to advance AI trustworthiness.
Such a specific AI system would rely on the use of deterministic neural networks, ensuring that the AI parameters monitored and controlled remain within an accepted range.


Ensuring the safety and security of AI, and more specifically of what we have defined as deep AI, is a research field that is only just dawning.
Nevertheless, AI development is blossoming worldwide and, as we saw with the cybersecurity aspects of digital development in the 1990s and 2000s, the safety aspects of AI development are not really being considered today. For civilian nuclear power, the IAEA (International Atomic Energy Agency) draws up safety requirements and standards to ensure nuclear safety. Air transport safety regulations and certifications are managed by various national and transnational bodies (FAA, EASA, ...). The digital world still has no transnational body to ensure digital safety, and AI is raising unprecedented issues.
When it comes to connecting complex AI, and even deeper AI, to systems connected to the physical world, the safety issues (and the resulting legal and regulatory ones) will suddenly be raised.
Now is the time to start considering what the main methods and rules should be to ensure AI trustworthiness.
The principles set out here are a first attempt to lay the groundwork for this new area of research and development.


1 - Bostrom, N. - Superintelligence: Paths, Dangers, Strategies. Oxford, 2014
2 - "AlphaGo - Google DeepMind" - Retrieved 30 January 2016.
3 - Wissner-Gross, A., Freer, C. E. - Causal Entropic Forces - Physical Review Letters
4 - Julius, A. A., D’Innocenzo, A. - "Combining analytical technique and randomized algorithm in safety verification of stochastic hybrid systems" - 2014 American Control Conference, Portland, OR, 2014, pp. 1438-1443.
5 - Slattery, T. - Determinism and indeterminism - http://breakingthefreewillillusion....

The Author: Eric Stefanello graduated from the French Ecole Polytechnique with a specialization in Theoretical Physics. He holds a post-graduate degree in AI and was among the first in France to study neural networks in the 1980s. During his professional career, he spent years securing and certifying IT programs in various domains including Armament systems, Nuclear Systems, Space and Aeronautics. He is also a specialist in cybersecurity and is the co-founder and President of Difenso, a cybersecurity company dedicated to data protection in the Cloud.
