Synthetic Data: The Infrastructure for the European Health Data Space

Why synthetic data is a key technology for enabling the secondary use of clinical data in Italian healthcare and accelerating medical research today.

Regulation (EU) 2025/327, which establishes the European Health Data Space (EHDS), has been in force since March 26, 2025. In Italy, the EHDS National Steering Committee, established by decree of Minister Schillaci on February 20, 2026, took office at the Ministry of Health on May 5, 2026. This body coordinates the Ministry of Health, AGENAS, ISS, AIFA, ISTAT, AgID, and the Regions.

The European Health Data Space: Primary and Secondary Use

The EHDS distinguishes between primary use (patient care) and secondary use (research, public health, policymaking, AI development). For secondary use, the main obligations come into effect in phases: by March 2027, Member States must designate their national Health Data Access Bodies (HDABs); by March 2029, the regulation on secondary data access becomes fully applicable, alongside the operational launch of HealthData@EU — the cross-border European infrastructure. Institutions holding data — hospitals, IRCCS (Scientific Institutes for Research, Hospitalization and Healthcare), Local Health Authorities (ASLs) — become data holders with precise obligations to respond to access requests.

The Problem: Clinical Data Locked in the System

The wealth of clinical data within the National Health Service (SSN), including electronic health records, laboratory data, diagnostic images, and outcomes, is effectively unusable for research in most Italian institutions for three structural reasons:

  • Regulatory-procedural barriers: each access request requires Ethics Committee approval, a DPIA, data sharing agreements, and potentially an opinion from the Data Protection Authority (Garante);
  • Regional fragmentation: Italian healthcare information systems are heterogeneous and not interoperable between different regions;
  • Re-identification risk: even datasets “anonymized” through traditional methods carry residual risks that block cross-institutional sharing.
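
To make the re-identification risk in the last point concrete, here is a minimal Python sketch of a k-anonymity check. The records, field names, and choice of quasi-identifiers are all hypothetical and invented for illustration. The idea is that a dataset with names removed can still single out a patient if a combination of ordinary attributes is unique.

```python
from collections import Counter

def k_anonymity(records, quasi_identifiers):
    """Group records by their quasi-identifier values.

    Returns the size of the smallest group (k) and the group counts.
    k == 1 means at least one record is unique on those attributes.
    """
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values()), groups

# Toy "de-identified" extract: no names, but quasi-identifiers remain.
# All values are invented for illustration.
records = [
    {"birth_year": 1958, "sex": "F", "postcode": "34121", "diagnosis": "I10"},
    {"birth_year": 1958, "sex": "F", "postcode": "34121", "diagnosis": "E11"},
    {"birth_year": 1971, "sex": "M", "postcode": "34121", "diagnosis": "I10"},
    {"birth_year": 1971, "sex": "M", "postcode": "34121", "diagnosis": "J45"},
    {"birth_year": 1990, "sex": "F", "postcode": "34151", "diagnosis": "C50"},
]

k, groups = k_anonymity(records, ["birth_year", "sex", "postcode"])
unique_combinations = [key for key, count in groups.items() if count == 1]
```

In this toy extract the last record is the only one with its combination of birth year, sex, and postcode, so anyone who knows those three facts about a person can read off the diagnosis. A k-anonymity check is only one of several possible risk measures and is shown here as an assumption, not as a method prescribed by the EHDS.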

The EHDS mandates solving this problem — but it does not specify how to achieve it technically. This is where synthetic data comes into play.

Synthetic Data: The EHDS-Ready Infrastructure

Synthetic data generated by AI consists of artificial data that replicates the statistical and analytical properties of a real dataset without containing information traceable to specific patients. This is not anonymized data: it is entirely new data, generated by a model trained on the original data.
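
The principle of "train a model on the original data, then sample new records from it" can be sketched in a few lines of Python. This is a deliberately simplified illustration: it assumes a two-column dataset and a bivariate Gaussian model, whereas production generators typically rely on far richer generative models. The data values and function names are hypothetical, and a plain Gaussian fit like this one carries no formal privacy guarantee.

```python
import math
import random

def fit_gaussian(rows):
    """Estimate the means and 2x2 covariance of (age, systolic_bp) pairs."""
    n = len(rows)
    mx = sum(r[0] for r in rows) / n
    my = sum(r[1] for r in rows) / n
    sxx = sum((r[0] - mx) ** 2 for r in rows) / (n - 1)
    syy = sum((r[1] - my) ** 2 for r in rows) / (n - 1)
    sxy = sum((r[0] - mx) * (r[1] - my) for r in rows) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def sample_synthetic(model, n, seed=0):
    """Draw n new records from the fitted model using a Cholesky factor."""
    (mx, my), (sxx, sxy, syy) = model
    rng = random.Random(seed)
    l11 = math.sqrt(sxx)
    l21 = sxy / l11
    l22 = math.sqrt(max(syy - l21 ** 2, 0.0))
    out = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        out.append((mx + l11 * z1, my + l21 * z1 + l22 * z2))
    return out

# Toy "real" records (age, systolic blood pressure); values are invented.
real = [(34, 118), (45, 125), (52, 131), (61, 140),
        (70, 148), (29, 115), (58, 137), (66, 144)]

model = fit_gaussian(real)                 # the only step that touches real data
synthetic = sample_synthetic(model, 1000)  # new records drawn from the model
```

The synthetic records are drawn from the model's parameters rather than copied from the real rows, while the relationship between age and blood pressure is carried over. Whether a given generator leaks information about individuals still has to be verified separately.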

The regulatory stance is clear in principle: the European Data Protection Supervisor has confirmed that data protection principles do not apply to anonymous data, and synthetic data falls into this category when it is properly generated, that is, when individuals cannot be re-identified by any means reasonably likely to be used. Within the EHDS ecosystem, this means:

  • It is not personal data under the GDPR → it can be shared without the same procedural barriers as real data;
  • It can be standardized into the formats required by HealthData@EU for cross-border interoperability;
  • It is analytically valid → it preserves the distributions, correlations, and clinical patterns of the original dataset;
  • It is audit-ready → it is accompanied by validation reports covering utility, privacy, and bias.
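
As a rough illustration of what a validation report might measure, the following Python sketch computes one utility metric (the gap between real and synthetic correlations) and two privacy metrics (distance to the closest real record and the rate of exact copies). The metric choice, function names, and data are assumptions made for this example and do not describe any specific product's report. Bias checks are omitted for brevity.

```python
import math

def pearson(rows):
    """Pearson correlation between the two columns of a list of pairs."""
    n = len(rows)
    mx = sum(x for x, _ in rows) / n
    my = sum(y for _, y in rows) / n
    sxy = sum((x - mx) * (y - my) for x, y in rows)
    sxx = sum((x - mx) ** 2 for x, _ in rows)
    syy = sum((y - my) ** 2 for _, y in rows)
    return sxy / math.sqrt(sxx * syy)

def correlation_gap(real, synth):
    """Utility: absolute difference between real and synthetic correlation."""
    return abs(pearson(real) - pearson(synth))

def distance_to_closest_record(real, synth):
    """Privacy: Euclidean distance from each synthetic row to its nearest real row."""
    return [min(math.dist(s, r) for r in real) for s in synth]

def exact_copy_rate(real, synth):
    """Privacy: share of synthetic rows that are identical to a real row."""
    real_set = set(real)
    return sum(1 for s in synth if s in real_set) / len(synth)

# Invented (age, systolic blood pressure) pairs.
# The synthetic set deliberately contains one copied row, (52, 131).
real = [(34, 118), (45, 125), (52, 131), (61, 140), (70, 148)]
synth = [(36, 120), (47, 127), (52, 131), (63, 141), (68, 146)]

report = {
    "correlation_gap": correlation_gap(real, synth),
    "min_dcr": min(distance_to_closest_record(real, synth)),
    "exact_copy_rate": exact_copy_rate(real, synth),
}
```

The planted copy shows why both sides of the report matter: the correlation gap is small, so the dataset looks useful, yet the minimum distance of zero and the non-zero copy rate flag a record that reproduces a real patient.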

Why Act Now, Not in 2029

The March 2029 deadline for the full implementation of the EHDS might seem far away. However, organizations building compliant data-sharing infrastructures today gain a concrete advantage in accessing European research funding: Horizon Europe and EU4Health already require compliant data-sharing capabilities.

Practical steps for hospitals and IRCCS:

  1. Map the available clinical data assets and the EHR systems currently in use.
  2. Evaluate synthetic generation for datasets intended for collaborative research or European grants.
  3. Update data governance: DPIAs, structured consent, and alignment with the EHR formats required by the EHDS.
  4. Identify the HDAB contact person as soon as the Italian Health Data Access Body is designated (by March 2027).

Aindo for Italian Healthcare

Aindo is the Italian synthetic data generation platform certified by Europrivacy (Art. 42 GDPR), ISO 27001, and ISO 9001, designed to operate within the EHDS/GDPR regulatory framework. It is already used by hospitals, health authorities, and clinical research centers to:

  • Generate synthetic cohorts for Real-World Evidence studies and clinical trials;
  • Share data between institutions without transferring personal data;
  • Accelerate data access for observational studies and epidemiological analyses.

Speak with one of our experts or explore healthcare use cases.
