AI and data accuracy: is the legal framework adequate?


Among the most discussed issues in the field of Artificial Intelligence in healthcare is the so-called ‘data accuracy’. But what exactly is meant by this expression?

When we talk about data accuracy in AI systems, we go far beyond the notions of data being ‘precise’ and ‘up to date’: we also refer to the accuracy of the ‘statistical modelling’ of the software. This means evaluating how often an AI system gives the answer that is deemed correct, given the data entered and the way the AI software itself processes them.

In essence, the notion of data accuracy in AI systems concerns not only the data entered, but also the logic and functioning of the AI software, as well as the final output of this processing[1].
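As a purely illustrative sketch of the statistical side of this notion (how often a system gives the answer deemed correct), the short Python fragment below computes the share of correct outputs over a set of cases. The function name and the sample values are hypothetical; they are not taken from any real system or from the ICO guidance.

```python
from typing import Sequence


def statistical_accuracy(predicted: Sequence[str], expected: Sequence[str]) -> float:
    """Share of cases in which the system's output matches the answer deemed correct."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not expected:
        raise ValueError("at least one case is required")
    correct = sum(p == e for p, e in zip(predicted, expected))
    return correct / len(expected)


# Hypothetical outputs of a diagnostic tool versus the clinically confirmed findings.
system_output = ["positive", "negative", "negative", "positive", "negative"]
confirmed = ["positive", "negative", "positive", "positive", "negative"]

accuracy = statistical_accuracy(system_output, confirmed)
print(f"Statistical accuracy: {accuracy:.0%}")
```

A single figure of this kind says nothing about whether the input data were themselves precise and up to date, which is why the two meanings of accuracy need to be kept distinct.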

The relevance of this process where AI is used in healthcare is evident: the accuracy of the response provided by the software can directly affect the patient or the decision that the doctor makes about the patient, in both cases affecting the patient's health. Liability issues may arise as a result.

As legal experts, we need to consider whether our legal system currently provides a sufficient framework to regulate AI systems and to ensure that they comply with the principles of ‘data accuracy’ in the sense outlined above.

It is the opinion of the writer that - although complex and open to improvement - the current legal framework provides regulatory tools that are reasonably adequate to ensure the accuracy of AI systems in health care.

The answers in fact should be sought in EU Reg. 2016/679 (so-called GDPR) on data protection and in the new EU Reg. 2017/745 (so-called MDR) on medical devices.

Let’s delve further into the topic.

Data Accuracy in the GDPR

The GDPR aims not only to harmonise Member States' data protection legislation, but also to provide answers in a highly innovative technological environment. The GDPR itself acknowledges in its Recitals that "Rapid technological developments and globalisation have brought new challenges for the protection of personal data" (Recital 6) and that "Those developments require a strong and more coherent data protection framework" (Recital 7).

The GDPR is therefore not a closed and finite system, but a system of provisions that interact with one another: they are partly specific and binding, and partly programmatic and to be used as future guidance.

Thanks to its flexibility, and provided its provisions are read together, the Regulation can supply the tools needed to solve new problems which did not exist, or which appeared distant, at the time it was drafted.

To do this, the principles that guide its normative structure should be read as a value chain to which every other provision must be firmly linked. Reading the provisions in this way makes it possible to build a dynamic regulatory framework, which in turn enables the user to chart a path through largely unexplored terrain.

In this sense the GDPR undoubtedly offers a comprehensive framework that, in the writer's opinion, proves able to successfully address the requirement of data accuracy as well as ethical concerns.

The cornerstone of the entire GDPR is Article 5 on the principles of processing. This article provides that personal data must be processed in a ‘fair’ and ‘transparent’ manner in relation to the data subject (Art. 5 par. 1 letter a) and that they must be ‘accurate’ and ‘kept up to date’ (Art. 5 par. 1 letter d).

In Artificial Intelligence, these principles - fairness, transparency and accuracy of data - should be read in close connection with each other.

The principle of accuracy requires (in general) that all data processed are accurate and that every reasonable step is taken to rectify inaccurate data.

In the field of AI, the accuracy of data must be seen both as an initial requirement and as a final objective, so that it represents the core characteristic of the processing activities. In fact, the "intelligence" of a machine depends on the information that it is supplied with and that it processes. Thus, if the machine is supplied with incorrect or inaccurate data, it will return inaccurate results. This holds where the system returns data through a plain input-output process, and even more so where the system is able to learn and evolve based on the data it knows and processes.

The principle of fairness, on the other hand, concerns the processing itself. Compliance with this principle - given its interpretative scope - ultimately includes ethical aspects as well.

In this respect, the ICO - Information Commissioner's Office - appropriately states the following in the ‘Guidance on AI and data protection’, in the section on ‘How do the principles of lawfulness, fairness and transparency apply to AI?’:

… if you use an AI system to infer data about people, in order for this processing to be fair, you need to ensure that:

  • the system is sufficiently statistically accurate and avoids discrimination; and
  • you consider the impact of individuals’ reasonable expectations.

The two references to avoiding discrimination and meeting the ‘reasonable expectations’ of the data subject seem to encompass many of the aspects related to the ethicality of the software[2]. It follows that complying with the principle of fairness referred to in Art. 5 GDPR implies - in essence - respect for the principles of ethical processing.

The whole process must therefore be transparent, in the sense that the data subject should be able to understand how the data and information are processed. Only what is transparent can be assessed and, as a result, trusted by the data subject and, eventually, by the community.
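To make the link between statistical accuracy and discrimination more concrete, the following Python sketch compares accuracy across two groups of data subjects. It is only one simple check, under the assumption that each case carries a group label; the group names, the data and the function name are all hypothetical, and a check of this kind does not by itself establish that processing is fair.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Case = Tuple[str, str, str]  # (group, system output, answer deemed correct)


def accuracy_by_group(cases: Iterable[Case]) -> Dict[str, float]:
    """Return, for each group, the share of cases the system answered correctly."""
    totals: Dict[str, int] = defaultdict(int)
    correct: Dict[str, int] = defaultdict(int)
    for group, output, expected in cases:
        totals[group] += 1
        if output == expected:
            correct[group] += 1
    return {group: correct[group] / totals[group] for group in totals}


# Hypothetical cases: group label, system output, confirmed finding.
cases = [
    ("group_a", "positive", "positive"),
    ("group_a", "negative", "negative"),
    ("group_a", "negative", "negative"),
    ("group_a", "positive", "positive"),
    ("group_b", "negative", "positive"),
    ("group_b", "positive", "positive"),
    ("group_b", "negative", "positive"),
    ("group_b", "negative", "negative"),
]

per_group = accuracy_by_group(cases)
# Difference between the best-served and worst-served group.
gap = max(per_group.values()) - min(per_group.values())
print(per_group, f"gap: {gap:.0%}")
```

In this invented example the system is right in six cases out of eight overall, yet every error falls on the second group: an acceptable overall figure can conceal an outcome that is far less accurate for some data subjects than for others.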

This last aspect is undoubtedly one of the most challenging[3].

The set of principles expressed in the GDPR, which should guide the development of artificial intelligence, therefore seems to indicate a coherent path, even though its practical application is admittedly complex.

In addition, the entire process - not only the design of the AI system but also its use - must be carried out under another core norm of the GDPR: Art. 25.

This provision requires respect for the principles of privacy and data protection by design and by default.

That is to say that data controllers and innovators, before they start the processing and already at the design stage of the system, should define the purposes of the processing, identify the data needed to achieve those purposes, establish which data are essential, plan how the system will function and define rules of conduct for the persons operating it.

It follows that the concepts of privacy by design and by default precede and support, both conceptually and in time, respect for the principles of accuracy, fairness and transparency.

The combination of the principles of privacy by design and by default, the accuracy of input data, the fairness of the processing method and transparency towards the data subject makes it possible to obtain quality output data.

In a broad sense, quality can be seen as the characteristic of the data processed by artificial intelligence that is fully compliant with every principle and objective of the Regulation. Quality thus becomes the benchmark, used by data controllers and innovators, to measure the correct functioning of machines that employ artificial intelligence.

Data Accuracy in the MDR

As regards AI in health care, the above-mentioned legal framework intersects with (or rather completes) the provisions of the new EU Regulation 2017/745 on medical devices (so-called MDR).

First of all, it should be pointed out that AI software must comply with the MDR when it falls under the definition of ‘medical device’ under Art. 2 par. 1 MDR (as it can act directly on humans) or under the definition of ‘accessory for a medical device’ under Art. 2 par. 2 MDR (as it assists the doctor in their decision-making process). With regard to the aspects that are relevant in this article, the MDR is equally applicable to both instances.

Given the above, it should also be noted that AI software can be marketed only if it complies with the General Safety and Performance Requirements set out in Annex I (Art. 5 par. 2). Conformity with these requirements must also be proven and substantiated through a specific clinical evaluation (Art. 5 par. 3).

On this point, it is also important to note that the requirements set forth in Annex I concern not only safety but also performance, and that clinical performance is defined as follows in Art. 2 par. 52:

‘clinical performance’ means the ability of a device, resulting from any direct or indirect medical effects which stem from its technical or functional characteristics, including diagnostic characteristics, to achieve its intended purpose as claimed by the manufacturer, thereby leading to a clinical benefit for patients, when used as intended by the manufacturer;

Therefore, in accordance with the MDR, AI software can be marketed only if the manufacturer is able to demonstrate its clinical benefit: i.e. the clinically correct response resulting from the output of the data entered and processed according to the designed statistical modelling.

Clinical performance - and clinical benefit - must be proven through a specific clinical evaluation (Art. 61 and Annex XIV of MDR) that for software must follow the specifications of the document recently issued by the Medical Device Coordination Group, the ‘MDCG 2020-1 Guidance on Clinical Evaluation (MDR)/ Performance Evaluation (IVDR) of Medical Device Software’.


Based on the legal framework outlined above, it follows that AI software must comply with both GDPR and MDR rules.

The combined application of the two sets of rules results not only in the obligation to ensure data accuracy (in the broad sense used in this article) but also in the obligation to ensure a clinical benefit for the patient through the output of AI software.

It is therefore the opinion of the writer that today the theoretical legal framework is sufficient and adequate to guide AI designers and manufacturers, to build trust - among medical professionals as well as patients - and to allow the development of such systems.

However, it is clear and undeniable that since this is a new, complex and multidisciplinary subject, its practical application requires study and specific expertise.


[1] On this point, the distinction that the ICO draws on its website, in the section ‘What is the difference between ‘accuracy’ in data protection law and ‘statistical accuracy’ in AI?’, is of interest.

[2] On the principles of ethics in AI, please refer to the “Ethics guidelines for trustworthy AI” of the High-Level Expert Group on Artificial Intelligence within the EU Commission.

[3] To ensure ethical processing, explainability of AI systems is essential. On this point, see the guidance “Explaining decisions made with AI” issued by the ICO and the recent guidance “Four Principles of Explainable Artificial Intelligence” issued by NIST in August 2020.