Medical research: the Data Protection Authority's guidelines for the proper processing of personal data
One of the most complex issues following the full entry into force of the GDPR is undoubtedly that of scientific research in healthcare. There are several reasons for this.
The first, unquestionably, is a changed reality. The increasing digitalisation of our society, especially in the healthcare sector, leads to a hyper-production of data which opens the way to projects, initiatives and analysis possibilities that were unimaginable until a few years ago.
GDPR and the use of data collected for medical research
As a result, stakeholders of every kind (healthcare facilities, scientific societies, industry, universities, individual healthcare professionals) raise a wide range of questions and doubts: how can I use the data collected for diagnosis and treatment purposes for research? Can I provide third parties with data to train AI software? And in that case, who owns the results? How can I build a database of health data? Can I profile the data? How can I process the so-called ‘Real World Data’? When can I consider a piece of data as anonymous? And these are just a few.
This (no longer so) new factual situation is also embedded in a legal framework that is complex to interpret and apply.
The arrival of the GDPR was hailed as a breath of fresh air for scientific research. Art. 5(1)(b) introduced the possibility of processing data for further purposes (e.g. scientific research) when this cannot ‘be considered to be incompatible with the initial purposes’ (e.g. diagnosis and treatment). Art. 6(4) sets out the criteria for carrying out this compatibility test, in line with Opinion 3/2013 of the Article 29 Working Party. Art. 9(2)(j) introduced a legal basis for scientific research, Art. 14(5)(b) relieved the data controller of the obligation to provide information on the processing where this ‘proves impossible or would involve a disproportionate effort’, and Art. 89 sets out the specific safeguards to be applied in scientific research.
The framework, however, remained somewhat unclear with regard to the health sector. In fact, the GDPR lawmaker did not have the courage (or more likely the political strength) to go all the way and in Article 9(4) left it to Member States to 'maintain or introduce further conditions, including limitations, with regard to the processing of genetic data, biometric data or data concerning health'.
In a way, it stirred things up without taking full responsibility, and this opening became a gateway that led Member States (some more than others) to introduce very different laws on the topic, often very stringent.
An excellent comparative analysis of these laws can be found in the 2021 Assessment of the EU Member States’ rules on health data in the light of GDPR, which highlights the different national frameworks and the difficulties that these differences in data processing rules create for research at European level. An EU-level response to this complex problem is now being sought through the Proposal for a Regulation on the European Health Data Space (published on 3 May 2022), on which the EDPS and EDPB have already commented in a specific Joint Opinion released on 12 July 2022.
As far as Italy is concerned, it can hardly be said that it did not (ab)use the space left by the GDPR for the introduction of specific rules for scientific research in the health sector (the aforementioned Article 9(4)).
While not wishing to deny the importance of protective rules in this area, the resulting Italian framework is, as many have said, undoubtedly very stringent as well as a fair mess.
Legislative Decree 101/2018, which harmonised the Italian Privacy Code with the GDPR, betrayed the spirit of the GDPR itself: it only superficially updated the existing Article 110 and introduced Article 110-bis, which originated from the amendments made by European Law 2017 (Law No. 167 of 20 November 2017).
Very briefly:
- Art. 110 concerns medical, biomedical and epidemiological research and (going beyond the EU provisions that mentioned only further conditions) imposes consent as a legal basis, establishing that where it is not possible to obtain the consent of the data subject, the opinion of the Ethics Committee and prior consultation of the Data Protection Authority (DPA) pursuant to Art. 36 GDPR are required (following an Impact Assessment under Art. 35 GDPR).
- Art. 110-bis, on the other hand, deals with further processing by third parties, for which the DPA's authorisation is required. The authorisation is to be issued within 45 days, and the DPA's silence is equivalent to a refusal. However, the current data protection system no longer provides for an authorisation regime on the part of the DPA, and all stakeholders are wondering whether the requirements set out in Provision 146 of 5 June 2019 (which brought the previous authorisations for the processing of sensitive data into the GDPR framework) can be considered a general authorisation, compliance with which would also allow processing by third parties.
In this highly complex and somewhat confusing framework, the Data Protection Authority recently intervened with an Opinion on Article 110 originating from a prior consultation (Register of Measures No. 238 of 30 June 2022). As it clarifies some important aspects, this Opinion definitely deserves to be analysed.
The case submitted to the Data Protection Authority
The University Hospital of Verona applied for a prior consultation of the DPA (under Art. 110(1) Privacy Code and Art. 36 GDPR) as promoter of a non-pharmacological interdepartmental observational study called "DB Torax" with both prospective and retrospective data.
In essence, the aim was to create a database of the population of patients with neoplastic (and non-neoplastic) diseases of the thoracic district, which could then be used for further studies to improve knowledge and clinical practice in the field of diseases of the thorax. More precisely, the study protocol specified that “plans for detailed statistical analyses will be set up in future research protocols that will use this database as a source of data in order to achieve the objectives of the specific studies”.
The hospital submitted, as an annex to the application, an impact assessment pursuant to Article 35 GDPR concerning the creation of the database and subsequent studies.
Regarding the legal basis, the Impact Assessment showed that:
- for prospective data, consent would be collected;
- for retrospective data, since much of the data referred to deceased persons and the collection of consent was very complex even for living persons (only 10% was available), the legal basis was to be drawn from the procedure in Art. 110 of the Privacy Code.
More precisely, the data processing for subsequent studies was considered 'further' and 'not incompatible' with the initial collection and thus no further legal basis was needed.
The Impact Assessment also explained precisely which information sets would be collected, how the data would be pseudonymised, how the data would be anonymised after 20 years, and which security measures would be implemented.
It was on this basis that the Italian DPA issued its Opinion.
Medical research and data processing: the DPA's Opinion
As regards the legal basis of the processing, after a very precise and detailed reconstruction (quite useful given the complexities outlined above), the Data Protection Authority makes some interesting points.
More precisely, it specifies that consent for prospective data and the procedure under Art. 110 of the Privacy Code for retrospective data legitimise the processing of data for the purpose of building the database. By contrast, subsequent studies, which will be carried out by selecting data from the database, cannot be considered compatible with the initial processing and will therefore require specific consent. The Opinion reads as follows:
‘It follows that the consents collected for the creation of the Torax DB (or, alternatively, the prior consultation procedure under consideration) cannot also constitute the legal basis for further processing, since they represent a partial manifestation of will that shall be progressively completed with the further and specific requests for consent that will have to be made by the hospital when carrying out future studies (Art. 6(4) GDPR and Guidelines 5/2020 on consent under Regulation (EU) 2016/679)’.
In essence, it envisages a 'general' consent (or procedure under Art. 110) for the establishment of the database and a specific 'progressive' consent (or procedure under Art. 110) for each subsequent study.
The analysis on the anonymisation of data is also very interesting. The DPA, in fact, considered the techniques presented to render the data anonymous to be valid:
- the deletion of 51 variables including those leading to the direct identification of the data subjects ('record_id' and 'patient code'), further variables that were excessive or likely to increase the risk of re-identification (e.g. date of birth, signature of informed consent), as well as those deemed ‘more useful for the organisational purposes of the study than for data analysis purposes';
- the randomisation of 57 variables, divided into 3 categories: age at enrolment, age at diagnosis, and ‘days for the remaining variables in the section covering all dates reported in the Torax database’;
- the generalisation of 293 variables by means of aggregation and k-anonymisation, ensuring that each value relating to a subject is shared by at least a minimum number (k) of other persons within the set.
The DPA found the application of these techniques suitable for lowering the risk of re-identification to such an extent that the data can be considered anonymous.
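To make the three families of techniques more concrete, here is a minimal sketch in Python. It is not the procedure actually used by the hospital or assessed by the DPA: the records, the field names other than 'record_id' and 'patient code', the noise range, the ten-year age bands and the choice of quasi-identifiers are all hypothetical and serve only as illustration. The k-anonymity check uses the common textbook reading, under which every combination of quasi-identifier values must be shared by at least k records.

```python
import random
from collections import Counter

# Hypothetical toy records; this is NOT the real Torax DB schema.
records = [
    {"record_id": 1, "patient_code": "A01", "age_at_diagnosis": 61, "days_to_surgery": 14, "stage": "II"},
    {"record_id": 2, "patient_code": "A02", "age_at_diagnosis": 67, "days_to_surgery": 30, "stage": "II"},
    {"record_id": 3, "patient_code": "A03", "age_at_diagnosis": 64, "days_to_surgery": 2, "stage": "II"},
    {"record_id": 4, "patient_code": "A04", "age_at_diagnosis": 72, "days_to_surgery": 21, "stage": "III"},
    {"record_id": 5, "patient_code": "A05", "age_at_diagnosis": 78, "days_to_surgery": 9, "stage": "III"},
]

DIRECT_IDENTIFIERS = {"record_id", "patient_code"}
QUASI_IDENTIFIERS = ("age_at_diagnosis", "stage")  # assumed for this example


def suppress(record, fields=DIRECT_IDENTIFIERS):
    """Deletion: drop the variables that identify a subject directly."""
    return {key: value for key, value in record.items() if key not in fields}


def randomise(record, rng, max_shift=3):
    """Randomisation: add small bounded noise to a day-count variable."""
    out = dict(record)
    noise = rng.randint(-max_shift, max_shift)
    out["days_to_surgery"] = max(0, out["days_to_surgery"] + noise)
    return out


def generalise(record, width=10):
    """Generalisation: replace the exact age with a band, e.g. 61 -> '60-69'."""
    out = dict(record)
    low = (out["age_at_diagnosis"] // width) * width
    out["age_at_diagnosis"] = f"{low}-{low + width - 1}"
    return out


def min_class_size(rows, quasi_identifiers=QUASI_IDENTIFIERS):
    """Size of the smallest group of rows sharing the same quasi-identifier values."""
    counts = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(counts.values())


def is_k_anonymous(rows, k, quasi_identifiers=QUASI_IDENTIFIERS):
    """True if every quasi-identifier combination appears in at least k rows."""
    return min_class_size(rows, quasi_identifiers) >= k


rng = random.Random(0)  # fixed seed only to keep the example reproducible
anonymised = [generalise(randomise(suppress(r), rng)) for r in records]

print(min_class_size(records))        # smallest group before generalisation
print(is_k_anonymous(anonymised, 2))  # check after all three steps
```

In this toy data set every raw record is unique on age and stage, whereas after generalisation the smallest group contains two records. A real assessment would also have to justify the choice of k and of the quasi-identifiers, which is the kind of reasoning an Impact Assessment is expected to document.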
Conclusions
After reading this Opinion, we tried to draw some conclusions from it.
The first is without doubt that when the Impact Assessment is properly done, the DPA is not an enemy but a consultative body, as it should be. As such, there is no reason to try to avoid applying Art. 110 of the Privacy Code (a request that we often receive). Instead, it is more effective to seek to understand the system and try to work well within it.
The second conclusion, closely linked to the first, is that we may not like the Art. 110 and 110-bis system (as we said above), but this is what we have at the moment. Therefore, until a forthcoming intervention by the Italian lawmaker (or perhaps a change from the EU with the Health Data Space), the Italian DPA cannot go beyond the bounds imposed by the current law.
The third conclusion concerns anonymous data: many, especially public facilities, are still quite apprehensive about taking responsibility as to when to consider data as fully anonymous. We now have a clear and detailed example from the Data Protection Authority, so there are no more excuses.
Perhaps in this Opinion the Data Protection Authority could have been a little bolder on the 'compatibility' of processing for subsequent studies, but the road is still uphill on this matter, also from a cultural point of view. This is also demonstrated by the EDPB's own repeated postponement of a solution, which is still ongoing.
Nevertheless, we have now gained a clear document on how to build a health database. Given the interpretative chaos, the hesitations and fears of healthcare facilities, this is no small thing in our opinion.