Contribution of EEEI

On the context of this contribution

After setting out its political guidelines for “promoting European innovation”, the Commission, under the aegis of its President, Mrs Ursula von der Leyen, published a White Paper on 20 February 2020 entitled “Artificial Intelligence: a European approach to excellence and trust”.

The European Expertise & Expert Institute (EEEI) wishes to respond to the consultation that the Commission has undertaken with the various stakeholders, with an analysis limited to the subjects most closely linked to the role and the judicial or non-judicial functions of the expert.

Indeed, the aim of the EEEI is to contribute, through its work, to the convergence of national systems of judicial expertise with a view to guaranteeing, throughout the European judicial area, the legal and technical security of judicial decisions through the quality of the expertise carried out.

The EEEI brings together representatives of high courts, bar associations, expert associations, university professors and other figures concerned with these matters, from most EU Member States.

Finally, the EEEI emphasizes that it is independent of any public authority and of any private organisation, and in particular of the Arborus Association, whose work is mentioned below.
The present contribution was drafted by a working group set up by the EEEI, whose members are listed at the end of this document.

Introductory remarks

The adaptation of the global economy to the capabilities offered by digital tools in general and artificial intelligence (AI) systems in particular is transforming our societies in a profound but also insidious way by raising questions that bring into play the very foundations of our value system.

Indeed, the continuous increase in the size of networks and in computing power, the arrival of new technologies such as machine learning, and the availability of large quantities of data, amplified by the Internet of Things, are accelerating the penetration of AI into sectors as diverse as health, defence, transport, education, energy and justice.

In particular, the rapid progress made through self-learning allows a machine to build new knowledge from its own experience.

Nevertheless, the reliability of the algorithms that underlie the decisions made by these AI-based systems is often highly questionable (fragmentation, bias, discrimination, exclusion, etc.). By definition, an algorithm is only a model offering a simplified representation of reality.

Moreover, for “algorithms derived from the artificial neural network paradigm, in particular, from deep learning (i.e. comprising several linked layers of nodes that process different information and then aggregate them)” it is impossible for engineers to explain the result produced since, “unlike the logic-deductive reasoning of AI systems, it is not possible to extract a clear and coherent decision tree”. [1]

Therefore, how can a decision be accepted if it is not explicable, measurable and provable?

“The logic of the latter (the algorithms) tends to shift the decision making towards the technical stages of system design” [1]. This runs counter to the imperative of making algorithms more intelligible and transparent.

This leads us to imagine, following the example of what European regulation already provides for personal data protection (GDPR) with “privacy by design”, that the governance of algorithms could form part of an “ethics by design” approach. It is therefore a management system that will have to be analysed in terms of ethics, not a finished product.

Remember that ethics is a demand for accountability that cannot be reduced to the mere expression of a conviction or an opinion. It should also be noted that normative ethics, and in particular consequentialism, while distinct from law and compliance, can sometimes serve to assess the correctness of the rules that the law lays down; it can even prefigure the law, particularly in the area of risk assessment.

Concerning this assessment, well-proven methods and tools are already available for analysing the operational safety and risk control of complex systems, of which AI is only a new construction model; with the reservation, however, that these methods and tools apply to “products” and not to “management systems”, as is the case here.

Thus, we consider that the acceptability of AI-based systems, which will only come about under conditions that ensure trust (I), presupposes, on the one hand, their rigorous evaluation, in particular by drawing up reference frameworks, charters, labels and certifications (II), and, on the other hand, the implementation of mechanisms capable of establishing a genuine gradation of risks in order, where necessary, to provide irrefutable proof of the chain of responsibility (III).

I – Building trust

On mechanisms for trust building

The trust placed in a system by its users depends essentially on their perception of the designer’s control of the risks inherent in the system.

In order to increase this perception, the system provider can implement various commercial recognition processes:

  • Adhering spontaneously to a charter, which will most often incorporate the general principles of respect for fundamental rights, non-discrimination, quality and security, transparency and neutrality, user and human control, respect for privacy, etc.
  • Obtaining a label based on a reference framework close to a charter, under the control of a body awarding this label according to criteria that are not always transparent.
  • Obtaining a certification, generally based on a reference framework of the same type as the previous ones, but this time endorsed by an official standardisation body such as ISO; this reference framework, thus endowed with the status of a “voluntary standard”, then complements the legal standards that are laws and regulations; the certifying body, in order to be able to issue its certificate, must itself undergo an accreditation process, complying with yet another voluntary standard.

These standards and certifications obviously generate very lucrative activities for these training and certification bodies. The issue of confidence in the system is then replaced by that of confidence in the accreditation procedures of this system.

What level of confidence should be placed in these charter, label and certification mechanisms? Even in technical matters they offer only a very relative guarantee. For example, the Boeing 737 MAX was certified to fly. In terms of cybersecurity, ISO 27001-certified companies are not immune to data leaks and attacks by hackers. As regards ethical commitments, examples of non-compliance, or of non-conforming products distributed by labelled or certified companies, are not rare. This leads to scandals, sometimes on a European scale, such as the 2012 “breast implants” scandal, the 2013 “horse meat lasagne” scandal and the 2015 “diesel engines” scandal; recalls of labelled and organic products are multiplying, not to mention counterfeit conformity marks affixed to products without any control.

When users, be they citizens or organisations, see the trust they placed in these processes betrayed, as will inevitably also happen with AI-based systems, they can still have recourse to the law and the courts. Called upon in cases of breach of trust, the legal and judicial system will then itself have to demonstrate that it is trustworthy: it will have to guarantee litigants that, where the commercial mechanisms of trust fail, the justice system will be able to determine responsibilities.

On the criteria applicable to AI-based systems

Trust is always a prediction, a bet on the future with a real proportion of uncertainty.

As indicated in the “White Paper”, a “trustworthy” AI should be based on European values and rules.

European legislation on data protection, privacy or non-discrimination already applies to AI-based systems. Given their specificities, should complementary legislation be added, and which one?

Ethical charters, such as those of the CEPEJ [3], or of the Arborus [4] collective, provide recommendations to AI designers, developers and users.

While the commitment to comply with a charter is an element of trust, it does not provide absolute certainty as to the absence of risk. The effective application of a charter is not always subject to control. The signing of a charter testifies to a statement of intent, which all too often amounts to no more than a marketing and image-enhancement exercise.

In the same way, product and service labelling and certification processes, by providing “reasonable” assurance of compliance with a predefined set of requirements, will provide an element of confidence with regard to AI, just as certification schemes do for quality, safety or the environment.

The objective of these processes is to inspire confidence by having a third-party organization verify compliance with a set of requirements. Confidence in these recognition processes is thus inferred from the choice of the reference system of requirements and the credit given to the certifying and accrediting bodies.

Beyond the certification reference framework, awarding the certificate requires an evaluation model defining the following rules:

  • who is in charge of assessing conformity to the reference framework?
  • according to which process?
  • what evidence can be required and verified?

These are all questions that need to be answered before considering the certification of AI-based systems.

II – Reducing and controlling risks

It is now usual to consider trust as a “risk-reduction mechanism”.

In order to reduce risks, they first have to be assessed. This assessment is contextual, not absolute, and will evolve over the lifetime of the AI-based system. The level of risk of an AI depends on multiple factors: the technologies implemented, the field of application, the users targeted and the company implementing the system, among others.

The “White Paper” identifies the potential risks of AI: lack of transparency of decision-making, discrimination on the basis of sex or other reasons, intrusion into privacy, criminal use.

A distinction is proposed between ordinary AI systems and high-risk AI systems, the latter requiring special protection measures. While this distinction is necessary, establishing it is not always easy. To make it more objective, it would be reasonable to define a shared scale of risk levels. However, we find this exercise very difficult to carry out, because one cannot speak of risk in absolute terms, but only of risk perception and sensitivity.

Users and providers of AI-based systems will each have their own perceptions and sensitivities, which will rarely coincide. It is equally unlikely that two providers of functionally similar AI-based systems will share the same perception of, and sensitivity to, risk.

AI-based systems are very complex, as they often themselves include other AI-based systems, and the risk analysis of such systems is an ongoing task. We therefore believe that the risk level of an AI-based system should be assessed before it is designed (by design) and reassessed at regular intervals. Ethical labels and certifications of AI-based systems should necessarily be based on an identification of risks, their regular re-evaluation and a check that risk-reduction actions are actually implemented. It should be remembered that risk assessment is only the first step in risk management, and must be followed by a step that reduces the occurrence and/or the impact of the risks identified.

It seems to us that the risk analysis and management methodologies widely used in industry, such as FMECA (Failure Mode, Effects and Criticality Analysis), can usefully be applied to AI-based systems by adapting them to the context and transposing the terminology to the specific failure modes of artificial intelligence. Finally, labels and certifications are more precise than simple charters, and they can help define common rules that promote the emergence of fair competition between designers of AI-based systems, for the benefit of their users.
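
By way of illustration, the sketch below shows how an FMECA-style criticality score might be transposed to AI failure modes; the failure modes, the rating scales and the values used are purely hypothetical and would have to be defined by the stakeholders for each system.

    from dataclasses import dataclass

    @dataclass
    class FailureMode:
        # One hypothetical AI failure mode, rated on 1-10 scales as in classic FMECA
        name: str
        severity: int    # impact if the failure occurs
        occurrence: int  # likelihood of the failure occurring
        detection: int   # difficulty of detecting it before harm is done

        @property
        def rpn(self) -> int:
            # Risk Priority Number: the classic FMECA criticality product
            return self.severity * self.occurrence * self.detection

    # Illustrative failure modes for an AI-based system (all values are assumptions)
    failure_modes = [
        FailureMode("Discriminatory bias in training data", severity=8, occurrence=6, detection=7),
        FailureMode("Model drift after deployment", severity=6, occurrence=7, detection=5),
        FailureMode("Unexplainable individual decision", severity=7, occurrence=8, detection=9),
    ]

    # Rank failure modes by criticality so that risk-reduction actions can be prioritised
    for fm in sorted(failure_modes, key=lambda f: f.rpn, reverse=True):
        print(f"{fm.name}: RPN = {fm.rpn}")

The interest of such an exercise lies less in the numbers themselves than in obliging the provider to list the failure modes and to re-evaluate them regularly, as argued above.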

III – Providing proof

However, these labelling and certification systems must be based on a sufficiently solid and trustworthy system of proof.

In order to address the problem of proof, we started with the Arborus Charter on Artificial Intelligence [4]. Designed to create a framework of trust with regard to artificial intelligence, this charter essentially seeks to prevent potential biases with regard to gender equality.

“It is a reference document both for Tech companies and for all those implementing AI to respect diversity by ensuring that the entire data value chain is responsible and that discriminatory biases are identified and controlled.” [4]

Our objective here is to highlight, in a concrete way, the means required to search for evidence in relation to allegations, and to underline the difficulties of a rigorous evaluation of the respect of these allegations. We are in no way seeking to criticise the Arborus initiative, which can only have positive effects in the pursuit of its objective.

Even if this charter does not cover all the areas addressed by the other current initiatives for labelling or certifying an ethical and trustworthy AI, it appears very representative of the problems that certifiers, as well as experts who have to establish liability after an incident, may encounter.

Let us examine, one after the other, the seven commitments made by the signatories of this charter and ask how an independent third party could be shown that each commitment has been fulfilled. This exercise is not exhaustive; it is merely illustrative:

1. Promote gender balance and diversity in the teams working on AI-based solutions.

As evidence of this first commitment, one can imagine:

  • Looking at the result: counting, within the teams working on AI, the ratio of women to men and comparing it with the same ratio for the rest of the company?
  • Looking at the “promotion” work that has been done, in which case it is a matter of evaluating the means deployed?
  • Collecting these figures over time to evaluate the effectiveness of the means deployed?

We have only looked at the ratio of women to men; since the objective here is diversity, we should probably take the analysis further. We could also imagine collecting more fine-grained indicators, carrying out analyses by site instead of considering the company as a whole, or using strict or broad definitions of the notion of “teams working on AI”…
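
Purely by way of illustration, here is a minimal sketch of how such an indicator might be computed; the staff records and the way “working on AI” is flagged are hypothetical and would have to be agreed with the audited organisation.

    from collections import Counter

    # Hypothetical staff records: (employee_id, gender, team, works_on_ai)
    staff = [
        ("e1", "F", "vision", True),
        ("e2", "M", "vision", True),
        ("e3", "M", "nlp", True),
        ("e4", "F", "finance", False),
        ("e5", "M", "finance", False),
    ]

    def share_of_women(records):
        # Share of women in a set of staff records (None if the set is empty)
        counts = Counter(gender for _, gender, _, _ in records)
        total = sum(counts.values())
        return counts["F"] / total if total else None

    ai_teams = [r for r in staff if r[3]]
    rest_of_company = [r for r in staff if not r[3]]

    print("Share of women in AI teams:", share_of_women(ai_teams))
    print("Share of women in the rest of the company:", share_of_women(rest_of_company))

The same computation can of course be broken down by site, or run with a narrower or broader definition of “teams working on AI”, which immediately shows how much the indicator depends on those definitional choices.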

2. Organise in order to assess and respond to any form of discrimination that could result from biased or stereotyped data.

With regard to this second commitment, one may wonder how the occurrence of discrimination is defined: is a single isolated case enough, or must there be a sufficient number of cases before discrimination can be declared to have occurred?

Once this has been defined, one could then look for evidence of the existence of a procedure for assessing and reacting in the event of discrimination. This would be indirect evidence. To go further, one would have to ask the question about the materiality of this procedure: Is it documented? Is it known? By how many people in the teams? Is it applied? Is its application traced? Has it already been used? What are the results of its use both in terms of evaluation and response? Has it been modified over time to adapt it to the realities encountered? What lessons have been learned?

3. Ensure the quality of the data used to guarantee the most equitable systems possible: unified, consistent, verified, traceable and usable data.

We assume that this refers to the data used to develop the algorithms, that is, the learning data (in the case of machine learning) and the test data. The qualifiers required of these data are numerous here.

The first two, “unified” and “consistent”, need to be explained in order to know how these points can be checked. As for the next one, “verified”: against which reference framework?

We understand traceability as the conclusive archiving of the learning and test data. To verify this point, it would be necessary to make sure that these data have been backed up and that the backup presented is indeed that of the data actually used to develop the algorithm. As there are usually several successive versions, one should be able to trace these data for each version. And in the event of a search for evidence after an incident, it should be possible to establish that the version in operation at the time of the incident is the version presented to the external “auditor”.
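
A minimal sketch of one way such traceability could be implemented is shown below, assuming the learning and test data are archived as files; the manifest format and directory layout are hypothetical. The idea is simply to record a cryptographic fingerprint of each data file for every model version, so that the archive presented later can be shown to be the one actually used.

    import hashlib
    import json
    from pathlib import Path

    def sha256_of(path: Path) -> str:
        # Fingerprint of a file, read in chunks to handle large datasets
        digest = hashlib.sha256()
        with path.open("rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                digest.update(chunk)
        return digest.hexdigest()

    def write_manifest(model_version: str, data_dir: Path, manifest_path: Path) -> None:
        # Record the fingerprint of every learning/test file for a given model version
        files = {str(p.relative_to(data_dir)): sha256_of(p)
                 for p in sorted(data_dir.rglob("*")) if p.is_file()}
        manifest = {"model_version": model_version, "files": files}
        manifest_path.write_text(json.dumps(manifest, indent=2))

    def check_manifest(data_dir: Path, manifest_path: Path) -> bool:
        # Later, an auditor can verify that the archived data match the recorded fingerprints
        manifest = json.loads(manifest_path.read_text())
        return all(sha256_of(data_dir / name) == fingerprint
                   for name, fingerprint in manifest["files"].items())

In such a scheme, the manifest itself would of course have to be protected (signed or timestamped) so that it cannot be rewritten after the fact.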

The last qualifier, “usable”, also requires an explanation of how its application is to be verified.

4. Provide training to raise awareness among, and empower, designers, developers and all actors involved in AI with regard to the stereotypes and biases that can generate discrimination.

This is a training objective.

Evidence could therefore be sought in terms of the implementation of means; for example, the number of hours of training on this theme per person concerned. This indicator can be refined by distinguishing different categories of the population concerned.
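
To make this kind of means-based indicator concrete, here is a minimal sketch computing training hours per person by category; the record format and the categories are hypothetical.

    from collections import defaultdict

    # Hypothetical training log: (person, category, hours of bias-awareness training)
    training_log = [
        ("p1", "developer", 4.0),
        ("p2", "developer", 2.0),
        ("p3", "data scientist", 6.0),
        ("p4", "product owner", 0.0),
    ]

    # Total hours and head count per category of the population concerned
    totals = defaultdict(lambda: [0.0, 0])
    for _, category, hours in training_log:
        totals[category][0] += hours
        totals[category][1] += 1

    # Average hours of training per person, broken down by category
    for category, (hours, people) in totals.items():
        print(f"{category}: {hours / people:.1f} h per person")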

One could also look for evidence of results: how did the training affect the people concerned? Were there trainee evaluations at the end of each training session, and what were their results? Could people who have taken the training be interviewed at random to check their level of knowledge?

5. Raise awareness among those who prescribe AI-based solutions (HR, finance, customer relations, marketing) of the risks of bias and stereotypes that can generate discrimination, and include checkpoints and iterative evaluation points in the specifications.

Here again, we are in a logic of training and awareness-raising, and the same remarks as for the previous paragraph apply, with both a logic of means (what means are implemented? are they sufficient? are they adapted?) and a logic of results, which could be measured in different ways: the level of awareness of the prescribers and the content of the specifications. With regard to the latter, here again we would mainly be measuring intentions.

If one wanted to go further, one would have to test the checkpoints being developed for the system and the content of the evaluations. If one wanted to be complete, one would also have to follow the evolution of the specifications as they are developed. In an agile approach, where documentation of developments is sparse, it would be necessary to track every iteration.

Being exhaustive may require considerable effort, even with the active contribution of the audited organization.

6. Ensure that suppliers are carefully selected and evaluated iteratively to ensure that the entire AI value chain is non-discriminatory.

Here, we would like to point out that most systems currently under development, and many systems already in use, are in fact combinations of AI-based systems. In other words, providers of AI-based systems are actually acting as assemblers and integrators.

Some systems with a broad purpose are standardised and off-the-shelf (e.g. voice recognition), others need to be developed more specifically. Some operate in the cloud, others at the user’s premises or those of the provider offering the service to the public.

Faced with such a diversity and variety of situations, how will purchasing managers be able to evaluate their suppliers, spread across different parts of the world and whose products are constantly evolving? How will they look for evidence after the fact? And how will the certifying body be able to attest to all this? Above all, problems can arise from the mere integration of these systems: system A works well, system B works well, but connecting system A to system B causes problems. This kind of situation is classic in computer science, but with neural-network-based systems its impact can be considerable.

7. Monitor AI-based solutions and continuously adapt processes.

Test data is there to check and validate AI-based solutions. This is true for any software released on the market.

Is there, however, a single piece of software on the market that is functionally flawless and secure? Unfortunately, the answer is no.

It is usually users who report functional flaws, and hackers, ethical or otherwise, who reveal security holes. AI-based systems will be no different.

It would certainly be easier to evaluate the reactions to reported faults: is there a process? Is it documented? Is it known to the people who have to apply it? Have the reactions succeeded in closing the gaps? How frequently do faults occur? Are the reactions proportionate and suited to the problems encountered?
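
As an illustration, two of these indicators (fault frequency and mean time to resolution) could be computed from an incident log along the following lines; the log format and dates are hypothetical.

    from datetime import datetime

    # Hypothetical incident log: (reported_at, resolved_at or None if still open)
    incidents = [
        (datetime(2020, 3, 1), datetime(2020, 3, 3)),
        (datetime(2020, 3, 10), datetime(2020, 3, 11)),
        (datetime(2020, 4, 2), None),
    ]

    # Fault frequency over the observation period
    observation_days = (datetime(2020, 4, 30) - datetime(2020, 3, 1)).days
    frequency = len(incidents) / observation_days

    # Mean time to resolution, over resolved incidents only
    resolution_times = [(resolved - reported).days for reported, resolved in incidents if resolved]
    mean_resolution_days = sum(resolution_times) / len(resolution_times)

    print(f"Fault frequency: {frequency:.2f} per day")
    print(f"Mean time to resolution: {mean_resolution_days:.1f} days")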

Conclusion: “Science without conscience is but the ruin of the soul” [5]

Digital Europe, trust and its levers

In February 2020, the European Commission proposed its vision of a digital Europe based on the development of reliable technologies to promote an open and democratic society and a dynamic and sustainable economy. On that occasion, Mrs Ursula von der Leyen expressed herself as follows:

“I want this digital Europe to reflect the best of our continent: openness, fairness, diversity, democracy and trust”.

Like any man-made tool or technique, the machine must remain at the service of Man.

Professor Philippe Moati, agrégé in economics and founder of the “Labo de la confiance” at the University of Paris-Diderot [6], is a recognised observer of trust-building mechanisms. Drawing on recent academic literature, he identifies three constituent factors of trust: competence, integrity and benevolence.

Measuring, objectively, the presence of the last two criteria, which have a moral dimension, in an AI system provider is a very difficult, if not impossible, exercise. Only time will show whether the company’s behaviour remains loyal to its customers.

It remains possible to objectify the first criterion, competence. This involves certification of the company’s resources, processes or products, testing by trusted third parties, and long-term commitments such as written guarantees or continuity of service.

The European Commission’s presentation specified that “in high-risk areas, such as health, police or transport, AI systems should be transparent, traceable and guarantee human control. Authorities should be able to test and certify the data used by the algorithms, just as they carry out checks on cosmetics, cars or toys.”

The main purpose of this contribution is to show that, in our view, obtaining evidence to certify that a finished product conforms to a standard is a well-known exercise. But this has very little to do with obtaining evidence that an AI-based system is working well, will work well in the future and has worked well in the past, if liability is sought.

“The legal code is not intended to be subject to the computer code” [7].

Information systems are not foolproof either functionally or in terms of security. AI-based systems will be no exception. Therefore, in absolute terms, one cannot consider a risk classification or scale of risks as an achievable goal. Nevertheless, such a classification is necessary and would allow players in the field to share common reading and assessment benchmarks.

Making the provider accountable for implementing an ethical and responsible approach, under the supervision of an AI supervisory authority, as was provided for under the GDPR, therefore seems to us to be a path worth exploring.

Members of the EEEI Working Group

  • Benoit de Clerck, Legal Expert in computer science,
  • Alice Louis, Information Heritage Governance Consultant, Director of the Cyber Ethical Fund for Digital Sovereignty project,
  • Martine Otter, Legal Expert in computer science,
  • Eric Parize, Legal Expert in computer science.

Bibliographic table

  1. Jocelyn Maclure and Marie-Noëlle Saint-Pierre, “Le nouvel âge de l’intelligence artificielle : une synthèse des enjeux éthiques”, Les Cahiers de propriété intellectuelle, vol. 30, p. 758, 19 September 2019, available at:
    http://www.ethique.gouv.qc.ca/fr/assets/documents/CPI_Maclure_Saint-Pierre.pdf
  2. CNIL, “Comment permettre à l’homme de garder la main ? Les enjeux éthiques des algorithmes et de l’intelligence artificielle”, December 2017, available at:
    https://www.cnil.fr/sites/default/files/atoms/files/cnil_rapport_garder_la_main_web.pdf#page=60
  3. Council of Europe, European Commission for the Efficiency of Justice (CEPEJ), European Ethical Charter on the Use of Artificial Intelligence in Judicial Systems, December 2018, available at: https://rm.coe.int/charte-ethique-fr-pour-publication4-decembre-2018/16808f699b
  4. ARBORUS Association, a collective of European companies under the patronage of the European Economic and Social Committee, devoted to promoting equality between women and men, first international charter for an inclusive AI, April 2020, available at: https://charteia.arborus.org/
  5. François Rabelais, Pantagruel, 1534; Gargantua sets out his son Pantagruel’s curriculum.
  6. Philippe Moati, Associate Professor of Economics at the University of Paris Diderot, founder and co-president of the association L’ObSoCo, “La confiance : fondements et enjeux”, available at: https://kuryo.typepad.com/lelabodelaconfiance/2011/11/la-confiance-dans-la-marquefondements-et-enjeux-philippe-moati.html
  7. Gaëtan Guerlin, Artificial Intelligence, 2019, Édition Grand Angle, Dalloz, p. 49.

To participate in the Institute’s AI working group, join the LinkedIn group led by Robert Ranquet.