EDPB Guidelines 1/2026 on the processing of personal data for scientific research purposes: a document that unravels many interpretative knots

21/04/2026

On 15 April 2026, the European Data Protection Board (EDPB) adopted Guidelines 1/2026 on the processing of personal data for scientific research purposes, which are now open for public consultation until 25 June 2026.

This is the most significant document on data protection for scientific research since 2018.

The Guidelines systematically address key challenges that have significantly affected European and Italian scientific research.

The Guidelines cover many topics: the definition of scientific research, the presumption of compatibility under Article 5(1)(b) GDPR, rules for further processing, broad consent, legal grounds for private companies, rules on special categories of data, and issues with anonymisation and pseudonymisation.

The document has 171 paragraphs in eight sections (67 pages).

This overview does not attempt a comprehensive review: it highlights the most innovative and impactful points for healthcare, life sciences, and digital health operators.

  1. The notion of “scientific research”

The GDPR does not define scientific research.

Recital 159 refers to it in broad terms (“technological development and demonstration, fundamental research, applied research and privately funded research”), but does not provide operational criteria for the controller to use.

Hence a recurring classification problem: when does a data analysis activity count as “scientific research” within the meaning of the GDPR, and so benefit from the favourable provisions scattered throughout the Regulation, and when does it not?

The EDPB addresses this problem. Drawing on its own Guidelines 5/2020 and on sources such as the OECD Frascati Manual, the UN health data recommendation, and the European Code of Conduct for Research Integrity, it sets out six main indicators in paragraph 11. If an activity meets all of them, it counts as scientific research. If one is missing, the controller must explain why the activity still qualifies as scientific research.

The six factors are as follows:

  1. a systematic approach following the methodology of the relevant field, with a complete research plan and a stated hypothesis or objective;
  2. compliance with the ethical standards of the relevant sector;
  3. verifiability and transparency, through the sharing of results — by way of publication or other form of dissemination, including in the future — without prejudice to legitimate limitations linked to intellectual property and trade secrets;
  4. autonomy and independence of the research team, which holds academic or scientific qualifications in the relevant field;
  5. objectives oriented towards the growth of general knowledge and the well-being of society — and the EDPB clarifies that this “does not exclude that the research may also aim to further commercial interests”;
  6. the potential to contribute to existing scientific knowledge or to apply it in innovative ways, assessable by independent experts or committees.
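
As a rough illustration of how a controller might record this assessment internally, the following Python sketch treats the six indicators as a checklist. The labels in `INDICATORS`, the `Assessment` record and the `assess` function are all assumptions made for illustration; the Guidelines do not prescribe any tooling, and the legal assessment itself cannot be reduced to booleans.

```python
from dataclasses import dataclass

# Short labels for the six indicators of paragraph 11 (labels are illustrative).
INDICATORS = (
    "systematic_method",
    "ethical_standards",
    "verifiability_transparency",
    "autonomy_independence",
    "general_knowledge_objective",
    "contribution_to_knowledge",
)


@dataclass
class Assessment:
    qualifies_by_default: bool  # True only when all six indicators are met
    missing: list               # indicators that are not met
    needs_justification: bool   # controller must explain why it still qualifies


def assess(met: dict) -> Assessment:
    """Classify an activity from a mapping of indicator label -> bool."""
    missing = [name for name in INDICATORS if not met.get(name, False)]
    return Assessment(
        qualifies_by_default=not missing,
        missing=missing,
        needs_justification=bool(missing),
    )


# Hypothetical internal analytics project whose results are never shared.
internal_project = {name: True for name in INDICATORS}
internal_project["verifiability_transparency"] = False

result = assess(internal_project)
print(result.missing)  # ['verifiability_transparency']
```

A missing indicator does not automatically disqualify the activity in this sketch; it only flags that a documented justification is needed, mirroring the rule summarised above.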

Verifiability and transparency are especially important: the EDPB says that sharing results with others—by publishing or otherwise disseminating them, now or in the future—demonstrates that the work is scientific. Of course, there can be limits to sharing to protect intellectual property or trade secrets. Making results public is not optional but is a key part of scientific research. If someone analyses personal data only for internal reasons, with no intention—now or later—of letting the scientific community review the results, that does not count as scientific research under the GDPR.

This point is very important for many business intelligence projects that call themselves "research", as well as for things like post-market surveillance of medical devices, which may not fit the meaning of scientific research. 

  2. Further processing of data

Here, at last, the Guidelines bring some clarity. Section 3 systematically addresses one of the most complex interpretative knots of the GDPR: the relationship between the presumption of compatibility of purposes and the principle of lawfulness of processing where personal data are processed for scientific research purposes after their collection.

More specifically, at paragraph 17, the EDPB identifies three scenarios:

  • the controller collects personal data in order to process it for specific scientific research purposes: those purposes are considered the primary purpose of the processing;
  • the controller has collected the data for specific scientific research purposes and subsequently intends to process them for another scientific research purpose not envisaged by the initial purpose: this is further processing for another (research) purpose;
  • the controller collects and processes the data for non-scientific purposes, but subsequently decides to process them for scientific research purposes: this is further processing for scientific research purposes of data initially collected for another purpose (e.g. collected for diagnosis and treatment and then processed for research).

In relation to these three scenarios, the Board refers to the presumption of compatibility under Article 5(1)(b) GDPR, stating that “in the case of further processing of personal data for scientific research purposes, it is not necessary to carry out the compatibility test under Article 6(4) of the GDPR” (paragraph 19).

Such compatibility, however, does not remove the need for an autonomous legal basis.

The EDPB, in fact, states that: “The question of the compatibility of purposes should not be confused with the principle of lawfulness” (paragraph 21, which refers to Joint EDPB-EDPS Opinion 2/2026). Consequently, “the possibility of relying on the legal basis on which the initial processing was based requires a lawfulness assessment in order to determine whether that legal basis is also suitable for the subsequent processing of personal data for scientific research purposes”.

In plain terms: once compatibility is presumed, the controller's obligation to identify a suitable legal basis under Article 6(1) (and, for special categories, a suitable derogation under Article 9(2)) for the new processing remains in place.

On this point, at paragraph 22, the EDPB clarifies that in many cases the controller will be able to rely on the same legal basis used for the initial processing operation: this occurs “where a controller has initially processed personal data on the basis of a public or legitimate interest”, legal bases which are structurally compatible with the evolution of the purpose over time.

By contrast, where the initial processing was based on consent or on a legal obligation, it is not always possible to rely on the same legal basis. The reason is structural: a change of purposes may exceed the specificity of the consent given (which covers only the operations and the purposes indicated in the declaration) or the scope of the statutory obligation. Given the opening towards broad consent, however, the need to classify subsequent processing as “further” processing becomes less pressing.

The presumption of compatibility also applies to special categories of data, but “the controller must in any event assess which derogation from the prohibition on processing special categories of personal data, under Article 9(2) of the GDPR, applies in the case of subsequent processing of such data for scientific research purposes”. In essence, here too, the controller who intends to rely on the same derogation used for the initial processing must verify its suitability for the new processing.

Finally, paragraph 26 clarifies what is to be done where controller A transmits personal data to controller B, who intends to process it for its own scientific research purposes.

In relation to that scenario, the EDPB clarifies two aspects: (a) the transmission by controller A constitutes further processing where controller B's research purpose was not among the primary purposes at the time of collection; (b) neither party is required to perform the compatibility test, but each must autonomously determine an adequate legal basis under Article 6(1), identify any applicable derogation under Article 9(2), and verify the additional conditions or limitations, if any, imposed under Article 9(4).

In sum: A may transmit to B, but each must have its own legal basis. 
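
The logic of this section can be sketched as a small decision helper. This is a simplified reading of the three scenarios and of paragraphs 19 to 26 as summarised above, not an official decision tree; the names `Scenario` and `checks_for` are hypothetical.

```python
from enum import Enum


class Scenario(Enum):
    PRIMARY_RESEARCH = 1          # data collected for the research purpose itself
    RESEARCH_TO_RESEARCH = 2      # collected for research, reused for other research
    NON_RESEARCH_TO_RESEARCH = 3  # collected for another purpose, reused for research


def checks_for(scenario: Scenario, special_categories: bool) -> dict:
    """Return which checks the controller still has to perform."""
    return {
        # Only the first scenario is primary processing.
        "further_processing": scenario is not Scenario.PRIMARY_RESEARCH,
        # Presumption of compatibility: no Article 6(4) test for research reuse.
        "compatibility_test_art_6_4": False,
        # Lawfulness is a separate question: a legal basis is always needed.
        "legal_basis_art_6_1": True,
        # A derogation is needed only when special categories are involved.
        "derogation_art_9_2": special_categories,
    }


# Hypothetical case: health data collected for treatment, reused for research.
print(checks_for(Scenario.NON_RESEARCH_TO_RESEARCH, special_categories=True))
```

In a transfer from controller A to controller B, each controller would run this kind of check for its own processing, since each needs its own legal basis.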

  3. Legal bases for processing for scientific research purposes

We now turn to several highly significant developments.

3.1. Broad consent

The possibility of collecting consent by research area has finally been officially recognised.

At paragraph 40, the EDPB acknowledges that “in the field of scientific research, controllers may rely on the data subject's consent to collect and process personal data in a specific area of scientific research, where the purposes of the research are not fully known at the time of collection of the data”. The door is thus opened to broad consent, a concept long established in international literature but which has received a very restrictive reading in Italy.

According to the EDPB, broad consent rests on two pillars.

The first is the delimitation of the research area (paragraph 45): the purpose may be circumscribed to a disciplinary field (for example, oncology, criminology, medical genetics) or defined by reference to the expected results (for example, the development of new therapeutic methods).

The second is the adoption of additional safeguards under Article 89(1) GDPR, referred to in paragraphs 48-50:

  • detailed and up-to-date information made available to the data subject (dedicated webpage, newsletter, individual communications);
  • measures for controlling use and access (independent data administrator, limited temporal validity of the consent, independent oversight body);
  • technical tools, such as privacy dashboards and consent receipts, enabling the data subject to exercise their freedom of choice with regard to individual projects.

The alternative is dynamic consent: the controller asks the data subject to consent separately to each individual project (or part thereof) as soon as the purposes become known (paragraph 41).

The two approaches are not mutually exclusive: they may be combined.

The choice must be documented before the processing, in line with the accountability principle (paragraph 42), and is particularly suited to long-term projects in which the researchers have a close and continuing relationship with the participants.

3.2. Public interest and exercise of official authority (Article 6(1)(e) GDPR)

On the public interest legal basis, there is an important clarification for private entities.

At paragraph 58, the EDPB states that “reliance on the performance of a task carried out in the public interest or in the exercise of official authority as a legal basis is not limited to public bodies carrying out scientific research. Private entities may also rely on that legal basis, where the legal act in question concerns their activities”.

This clarification (anticipated in Italy by Article 8 of Law No. 132/2025 on AI) allows a private entity that cooperates with the public administration, and that in some way takes part in an activity of public relevance, to process data legitimately on the basis of the public interest. That public relevance may derive from a general administrative act under Article 2-sexies of the Italian Data Protection Code.

3.3. Legitimate interest (Article 6(1)(f) GDPR)

Paragraph 61 states that: “scientific research and the processing of personal data connected thereto may constitute a legitimate interest within the meaning of Article 6(1)(f) of the GDPR, regardless of whether they are carried out for non-profit or commercial purposes”.

The commercial nature of the research is not, in itself, an obstacle to the application of that legal basis.

The EDPB adds that scientific research is “generally attributed particular importance as an activity beneficial to society”. Consequently, in the balancing test under Article 6(1)(f), the controller “may often attach significant weight to the processing of personal data for scientific research purposes, as against the interests or the fundamental rights and freedoms of the data subjects”. This is an indication which — if correctly applied — tends to facilitate research initiatives based on legitimate interest, including those undertaken by private entities.

The three mandatory steps of the legitimate interest assessment outlined in Guidelines 1/2024 remain in place: identification of a legitimate interest, the necessity of the processing, and a balancing exercise showing that the interest prevails over the fundamental rights and freedoms of the data subjects.

3.4. Processing of special categories of personal data (Article 9 GDPR)

The EDPB devotes Section 4.4 of the Guidelines to the derogations relevant to scientific research, analysing three pathways: explicit consent (point (a)), data manifestly made public by the data subject (point (e)), and derogations provided for by Union or Member State law (points (g), (i), (j)).

As regards explicit consent under Article 9(2)(a) — which may be broad or dynamic (paragraph 67) — the considerations already developed above on consent in general apply, with the further specification that the manifestation must be active and unambiguous, which rules out implicit or inferred forms.

As regards data manifestly made public under Article 9(2)(e), the EDPB adopts a restrictive interpretation. The term “manifestly” implies a high threshold and requires the data subject to take “a clear affirmative action” to make the data accessible to the public (paragraph 69). A social media post with default privacy settings does not meet the requirement: the controller must check the platform's settings and verify that the publication took place as the result of an active choice, in awareness of its public dimension. Context is also relevant: special categories actively published by the data subject themselves (an influencer, a videoblogger) may meet the requirement, whereas publication by third parties (for example, photographs published by a friend or relative) does not. Finally, at paragraph 72, the EDPB reiterates, on the basis of the CJEU judgment in Schrems v Meta Platforms Ireland (Case C-446/21), that the fact that a data subject has made certain special category data public at a given moment does not mean that they have made public other information within the same category.

As regards the national or Union derogations provided for in Article 9(2)(g), (i) and (j), the EDPB reiterates the systemic requirements: the legal measure must “respect the essence of fundamental rights and observe the principle of proportionality” (paragraph 73) and provide for appropriate and specific measures to safeguard the fundamental rights and interests of the data subject. In footnote 115, the EDPB includes, among the examples of relevant national and Union derogations, Article 110 of the Italian Data Protection Code and Article 57(1)(a)(i) of Regulation (EU) 2025/327 on the European Health Data Space (EHDS). In the subsequent paragraph 75, moreover, the EDPB specifically indicates Article 53(1)(e) EHDS as a derogation under Article 9(2) available to controllers processing health data for scientific research on the basis of legitimate interest (Article 6(1)(f)). 

  4. Research databases and research infrastructures

We now turn to a new topic: research databases and research infrastructures.

The Guidelines provide a series of indications, somewhat scattered throughout the document, which taken together offer an operational framework for so-called health databases. In Italy this was badly needed, because the topic is surrounded by genuine confusion.

The Guidelines devote a dedicated passage (paragraph 13) to research data infrastructures, together with three key examples (Examples 8, 22 and 25) that frame the architecture for biobanks, registries, hospital data repositories and federated consortium databases.

Paragraph 13 of the Guidelines defines the research data infrastructure as a repository or database intended to make data available for future research projects in a specific area. For the processing to fall within the GDPR's notion of research, the controller must assess the operations involved in managing the infrastructure against the six indicative factors set out above. The operator (controller) must also set the criteria for data collection, for the admission of projects, for the scientific qualification of the researchers and, for special categories, for adherence to ethical standards.

For databases, the Guidelines confirm broad consent as the typical instrument for feeding the database where the future purposes cannot be determined at the time of collection (paragraphs 43-47). Paragraph 47 goes on to clarify that broad consent covers the collection, curation, storage and making available of the data — including through a secure processing environment — as well as subsequent processing operations within individual projects, provided that these remain within the communicated research area and the reasonable expectations of the data subject.

Paragraph 43 — read in conjunction with Recital 33 GDPR and Recital 26 of Regulation (EU) 2022/868 (Data Governance Act) — accepts that several controllers may rely on the same broad consent. In the writer's view, this is the key to networked databases between healthcare facilities, universities and industrial sponsors.

To offset the reduced specificity, paragraph 49 requires additional safeguards: a dedicated webpage, a newsletter, an independent data trustee, time-limited consent, and an oversight body with representatives of data subjects, experts, and the DPO. Where future projects are foreseeable, dynamic consent should be used.

The Guidelines also provide illustrative scenarios that explain how this works in practice.

For example, the public university that sets up a health database by collecting pseudonymised data from hospitals is an autonomous controller for the management of the database; the pharmaceutical company accessing the database for its own R&D is an autonomous controller of its own processing: two controllers, two legal bases, two privacy notices. Another example: a research consortium with a common protocol and a federated database; in this case, all partners (pharma, universities, research institutes) are joint controllers under Article 26 of the GDPR for processing in the database and for the observational study. Hospitals, on the other hand, remain autonomous controllers for care purposes.

So, if (as I believe) these prescriptions are confirmed, what needs to be done?

In my view, those managing biobanks, registries, hospital data repositories or federated databases would do well to start considering the following aspects: verifying the qualification of the infrastructure according to the criteria set out above; revising privacy notices and broad consent forms and adapting them to the research area; reviewing the agreements between sponsors, universities, healthcare facilities and CROs, reassessing the various controller/processor roles. 

  5. Anonymisation and pseudonymisation

We now come to the topic of anonymisation and pseudonymisation.

Here too, the Guidelines provide rules that genuinely clarify many aspects that have always been a source of doubts and divergent positions.

Starting from Article 89 GDPR, according to which, where the scientific research purposes “can be fulfilled by further processing which does not permit or no longer permits the identification of data subjects, those purposes shall be fulfilled in that manner”, the EDPB states that, where possible, the data must be anonymised.

Where anonymisation is not possible — because, for example, the controller seeks to analyse developments in respect of an individual over time, which requires an identifier, or because anonymisation is not technically feasible — the data must be pseudonymised (paragraph 157).
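
To make the longitudinal case concrete, here is a minimal pseudonymisation sketch. Keyed hashing with HMAC-SHA256 is an assumption chosen for illustration: it is one common technique, not a method prescribed by the Guidelines. The identifiers, field names and values are invented. The point is only that the same person receives the same pseudonym at every visit, so developments over time can be analysed without the direct identifier.

```python
import hashlib
import hmac
import secrets


def pseudonymise(identifier: str, key: bytes) -> str:
    """Derive a stable pseudonym from an identifier using HMAC-SHA256."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


# The secret key is the "additional information" needed for re-identification.
# Assumption: it is held separately by the controller that pseudonymises.
key = secrets.token_bytes(32)

# Invented records for two hypothetical participants.
visit_1 = {"pid": pseudonymise("patient-001", key), "hba1c": 7.9}
visit_2 = {"pid": pseudonymise("patient-001", key), "hba1c": 7.1}
other = {"pid": pseudonymise("patient-002", key), "hba1c": 6.4}

# Same person, same pseudonym: the two visits can be linked over time.
print(visit_1["pid"] == visit_2["pid"])  # True
# Different people receive different pseudonyms.
print(visit_1["pid"] == other["pid"])    # False
```

Data treated this way remain personal data: whoever holds the key can regenerate the pseudonym for a known identifier, which is why this is pseudonymisation and not anonymisation.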

So far, nothing new.

The real novelty lies in the legal basis.

The question we have always asked ourselves is: is a legal basis required to anonymise data?

The EDPB answers with a formulation of fundamental operational importance: “a controller may rely on the same legal basis both for the anonymisation process and for the preceding processing operations, where such operations are part of a set of operations carried out for the same scientific research purposes”.

In substance, anonymisation does not require an autonomous legal basis if it is an integral part of the research process (a rule consistent with the CJEU judgment in Case C-77/21).

The same holds for pseudonymisation.

Here too, the EDPB rule is clear-cut: “where a controller processes personal data for scientific research purposes and applies pseudonymisation, the legal basis for the processing of the personal data extends to all the processing operations necessary to carry out the pseudonymisation transformation”.

What is new, on the other hand, is the requirement to assess the data format when the controller designs the processing plan: the assessment of the format in which data are to be processed in the research “should be carried out in the context of a risk analysis or, where appropriate, a data protection impact assessment (DPIA)”.

The choice between anonymisation, pseudonymisation and directly identifying data is, therefore, not a technical detail to be deferred to the implementation phase: it is a decision that must be made and documented at the time the processing is designed.
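
One way to picture this design-time decision is as an ordered preference that is evaluated and recorded before processing starts. The sketch below is an illustrative simplification: the two boolean inputs stand in for the outcome of the risk analysis or DPIA, and treating directly identifying data as the last fallback is this article's reading of the hierarchy. The names `DataFormat` and `choose_format` are hypothetical.

```python
from enum import Enum


class DataFormat(Enum):
    ANONYMISED = "anonymised"
    PSEUDONYMISED = "pseudonymised"
    IDENTIFIABLE = "directly identifiable"


def choose_format(achievable_anonymised: bool,
                  achievable_pseudonymised: bool) -> DataFormat:
    """Pick the least identifying format that still serves the research purpose."""
    if achievable_anonymised:
        return DataFormat.ANONYMISED
    if achievable_pseudonymised:
        return DataFormat.PSEUDONYMISED
    return DataFormat.IDENTIFIABLE


# Hypothetical longitudinal study: a per-person identifier is needed,
# so anonymisation is not possible but pseudonymisation is.
decision = {
    "project": "hypothetical longitudinal study",
    "format": choose_format(False, True).value,
    "rationale": "follow-up over time requires a stable identifier",
}
print(decision["format"])  # pseudonymised
```

Keeping the rationale next to the chosen format is what turns the choice into something documented, and that record can be revisited when datasets are combined or the re-identification risk changes.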

It follows naturally that “there should be an ongoing process of verification of the effectiveness of the measures implemented to ensure anonymisation or pseudonymisation during the course of a research project, or for as long as the data are otherwise retained, in particular where datasets are combined, which may increase the risk of re-identification”.

5.1. Coexistence of different formats and contractual obligations of non-re-identification

Processing operations for research purposes may be carried out on personal data in different formats, and it is in this context that contracts between the parties gain importance as organisational measures.

Where the controller receiving the data has no need to identify the data subjects, it must receive anonymous data. The controller supplying the data may, in turn, retain a pseudonymised or directly identifiable copy for its own purposes, such as ensuring the reproducibility of research results or reviewing the data. The model is linear: each actor holds the minimum format necessary for its own purposes.

In relation to this (widespread) scenario, paragraph 163 completes the picture at the level of legal and contractual obligations.

Contractual prohibitions on re-identification — to be included in every data-transfer agreement — “may, as organisational measures, complement technical anonymisation or pseudonymisation, in particular where datasets are shared between research partners or other recipients”. Contractual obligations must, moreover, ensure that, following a request from the data subject exercising their rights, the controller which has pseudonymised the data is able to provide the information necessary to re-identify them to the controllers or processors processing the pseudonymised data for research purposes, or — as an alternative — to ensure that the original controller enables the practical exercise of those rights. 

5.2. Transparency on format: a correction to Italian practice

Paragraph 164 closes the section with an indication intended to correct a very widespread Italian practice. Data subjects must be informed “to what extent their personal data are processed for research purposes in a form that allows identification and, where the personal data are made available to other recipients, to whom”. In particular, “data subjects should not be given the impression that their personal data will be anonymised where, in reality, the data will be processed in pseudonymised form, and the data subjects will still be identifiable, albeit indirectly”.

In substance, it must be assessed at the outset of the research whether the data will be anonymised or pseudonymised, and what actually happens must be clearly communicated to the data subject.

A final caveat, again from paragraph 164: data subjects must be informed before processing that it will not be possible to retrieve their data from an aggregated set or from the research results after anonymisation. In other words, after anonymisation they will no longer be able to receive individual information on the processing of their data (for example, the results of clinical tests), since these will no longer constitute personal data. This does not affect the possibility of receiving general and aggregate information on the development of the projects or on their results.

  6. Conclusions

EDPB Guidelines 1/2026 are open for public consultation until 25 June.

However, the overall framework already appears much clearer.

The qualification of activities that may be regarded as scientific research, the legitimation of private entities, broad consent, the distinction between compatibility and lawfulness of further processing, the regime of databases, and the hierarchy between anonymisation, pseudonymisation and identifying data are all topics that have long been sources of doubts and conflicting interpretations.

At a systemic level, the document signals a clear direction: scientific research is a social good that deserves a unified regime, regardless of whether the entity carrying it out is public or private, or whether the project is commercial or non-commercial. This is a direction with which the Italian Data Protection Authority (Garante) will, sooner or later, have to align its own internal choices.