DORA Synthetic Data Requirements for Resilience Testing

I watched a team at a mid-size European bank prepare for their first ICT resilience test under DORA. They had the threat scenarios mapped, the recovery procedures documented, the incident response team briefed. Then someone asked: what data are we testing with? The room went quiet.

[Figure: DORA regulatory timeline showing January 2025 enforcement with three pillars]

They had been planning to use masked production data, the same approach that got them through every previous audit. But DORA is not a previous audit. The Digital Operational Resilience Act, which has applied across the EU since January 17, 2025, explicitly references the use of synthetic data as part of the resilience testing framework. And the regulation does not treat it as optional.

Key Takeaway: DORA Articles 24 and 25 establish ICT resilience testing requirements for EU financial entities, and Article 26 adds threat-led penetration testing (TLPT) for the entities that supervisors designate. The regulation explicitly mentions synthetic data for TLPT. Born-synthetic data, generated from mathematical distributions with zero real PII, is the most direct path to compliance.

What Does DORA Actually Require for Resilience Testing?

DORA applies to virtually every entity in the EU financial ecosystem: credit institutions, payment processors, investment firms, insurance companies, crypto-asset service providers, and their critical ICT third-party providers. The scope is deliberately broad — the regulation was written after watching single points of ICT failure cascade across entire financial markets.

Articles 24 and 25 are the resilience testing core. Article 24 establishes the general requirement: financial entities must maintain a comprehensive ICT testing programme as part of their ICT risk management framework. Article 25 names the kinds of tests that programme should include, such as vulnerability assessments, network security assessments, scenario-based tests, and performance testing. This is not a one-off exercise: it is continuous and risk-proportionate.

Article 26 raises the bar for significant financial entities. Institutions identified by their competent authority must conduct advanced testing through Threat-Led Penetration Testing (TLPT) at least every three years. The TLPT framework, built on the TIBER-EU model, requires realistic threat scenarios that simulate actual adversary behaviour against the entity’s critical functions.

Here is where synthetic data becomes unavoidable: TLPT scenarios must be realistic. You cannot simulate a sophisticated attack against your KYC onboarding pipeline using 10,000 records of “John Smith, 123 Main Street.” The test is only as valid as the data feeding it. But you also cannot use production customer data in a penetration testing environment — the operational risk and data protection implications are unacceptable.

[Figure: Three-column comparison of test data approaches for DORA resilience testing]

Why Real Data Fails the DORA Test

I have seen three approaches that financial institutions default to when they need test data for resilience exercises. All three have problems that DORA makes worse.

Copied production data. The fastest approach and the most dangerous. You get realistic complexity but you introduce real PII into a testing environment that, by design, is being subjected to simulated attacks. If the test succeeds in exposing a vulnerability — which is the entire point — you have just exposed real customer data through a controlled breach. Under DORA’s incident reporting requirements (Articles 17-23), this could trigger a major ICT-related incident notification to your competent authority.

Anonymized production data. Better than raw copies but still structurally flawed. Anonymization preserves statistical properties at the cost of re-identification risk, and for ultra-high-net-worth individual (UHNWI) profiles with distinctive quasi-identifiers, the risk is not theoretical. A profile showing “$87M net worth, Luxembourg residence, family office structure, BVI trust” points to a very small set of real people whether or not the name field is blanked. For the full breakdown of why anonymization fails for financial data, see “5 Re-Identification Attacks Anonymized Financial Data Cannot Survive.”
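To make the quasi-identifier problem concrete, here is a minimal sketch of a k-anonymity style check. The column names and the three records are made up for illustration and are assumptions, not a real schema. The idea is simply to count how many records share each combination of quasi-identifiers; any combination that appears once is a candidate for re-identification even with the name blanked.

```python
from collections import Counter

# Hypothetical quasi-identifier columns; a real schema would differ.
QUASI_IDENTIFIERS = ("net_worth_band", "residence", "structure")


def _key(record, quasi_identifiers):
    """Project a record onto its quasi-identifier values."""
    return tuple(record[field] for field in quasi_identifiers)


def smallest_group_size(records, quasi_identifiers=QUASI_IDENTIFIERS):
    """Return k, the size of the smallest group of records that share
    the same quasi-identifier values (0 for an empty dataset)."""
    groups = Counter(_key(r, quasi_identifiers) for r in records)
    return min(groups.values()) if groups else 0


def unique_profiles(records, quasi_identifiers=QUASI_IDENTIFIERS):
    """Return the records whose quasi-identifier combination occurs once."""
    groups = Counter(_key(r, quasi_identifiers) for r in records)
    return [r for r in records if groups[_key(r, quasi_identifiers)] == 1]


# Illustrative, invented records with the name field already blanked.
records = [
    {"name": None, "net_worth_band": "1-5M", "residence": "DE", "structure": "none"},
    {"name": None, "net_worth_band": "1-5M", "residence": "DE", "structure": "none"},
    {"name": None, "net_worth_band": "50-100M", "residence": "LU",
     "structure": "family office + BVI trust"},
]

print("k =", smallest_group_size(records))
print("unique:", unique_profiles(records))
```

A smallest group size of 1 means at least one record stands alone on its quasi-identifiers. In a real assessment the choice of columns and the acceptable k are judgment calls for your data protection team.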

Generic synthetic data. Safe from a privacy perspective but worthless for resilience testing. If your synthetic profiles are flat — uniform wealth distributions, no offshore structures, no cultural diversity, no PEP indicators — your resilience test will pass scenarios that would fail against real-world complexity. DORA requires realistic testing, not theatrical testing.
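One rough way to spot "flat" test data is to measure how concentrated the wealth column is. The sketch below compares the share of total wealth held by the top 10% of profiles in a uniform sample and in a heavy-tailed sample. The 30M floor, the Pareto shape of 1.5, and the sample size are illustrative assumptions, not parameters from any real dataset.

```python
import random


def top_share(values, fraction=0.10):
    """Share of the total held by the largest `fraction` of values."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values, reverse=True)
    top_count = max(1, int(len(ordered) * fraction))
    return sum(ordered[:top_count]) / sum(ordered)


rng = random.Random(42)  # fixed seed so the comparison is repeatable

# Flat profile set: every net worth drawn uniformly from one band.
flat = [rng.uniform(30e6, 100e6) for _ in range(10_000)]

# Heavy-tailed profile set: Pareto draws scaled by an assumed 30M floor.
heavy = [30e6 * rng.paretovariate(1.5) for _ in range(10_000)]

print(f"flat top-10% share:  {top_share(flat):.2f}")
print(f"heavy top-10% share: {top_share(heavy):.2f}")
```

If your test dataset's top-decile share looks like the uniform case, screening thresholds and concentration rules are probably not being exercised the way a real book of clients would exercise them.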

[Figure: Venn diagram showing GDPR, EU AI Act, and DORA overlapping with born-synthetic data in center]

How Born-Synthetic Data Solves the DORA Resilience Problem

Born-synthetic data is generated from mathematical foundations — Pareto distributions for wealth, algebraic constraints ensuring assets minus liabilities equals net worth, geographic models for jurisdictions and cultural context. No real person is used as input at any stage.
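As a minimal sketch of that idea, and not the actual generator behind any product, the snippet below draws net worth from a Pareto distribution and derives liabilities from assets so the accounting identity holds by construction. The floor, shape, and leverage cap are assumed values chosen for illustration, and `generate_profiles` is a hypothetical name.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    total_assets: int       # whole currency units
    total_liabilities: int
    net_worth: int


def generate_profiles(n, seed, floor=30_000_000, shape=1.5, max_leverage=0.4):
    """Generate n profiles from a seeded RNG; no real records are read.

    floor, shape, and max_leverage are illustrative assumptions.
    """
    rng = random.Random(seed)
    profiles = []
    for _ in range(n):
        # paretovariate returns values >= 1, so net worth is >= floor.
        net_worth = int(floor * rng.paretovariate(shape))
        # Liabilities as a fraction of assets, capped at max_leverage.
        leverage = rng.uniform(0.0, max_leverage)
        total_assets = int(net_worth / (1.0 - leverage))
        # Derive liabilities so assets - liabilities == net worth exactly.
        total_liabilities = total_assets - net_worth
        profiles.append(Profile(total_assets, total_liabilities, net_worth))
    return profiles


sample = generate_profiles(5, seed=7)
for p in sample:
    print(p)
```

Working in whole currency units keeps the identity exact rather than approximately true under floating point, and seeding the generator means the same inputs reproduce the same dataset.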

This matters for DORA in three specific ways.

Realistic complexity without operational risk. Each born-synthetic profile carries 29 interlocked fields: net worth, total assets, total liabilities, property value, core equity, cash liquidity, offshore jurisdiction, offshore vehicle, PEP status, risk rating, sanctions screening results, adverse media flags, and more. These are not placeholder values — they reflect the structural complexity of real UHNWI portfolios across 6 geographic niches and 31 wealth archetypes. When your pen-test team launches a simulated attack against your KYC screening pipeline, the data behaves like production data without being production data.
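"Interlocked" means the fields constrain each other, which is something you can test before the data ever reaches a pen-test environment. The sketch below checks a handful of cross-field rules on a single profile. The field names follow the list above, but the specific rules (apart from the net worth identity) are my assumptions for illustration, not a published specification.

```python
def check_profile(profile):
    """Return a list of human-readable violations; empty means consistent.

    All rules except the net worth identity are hypothetical examples.
    """
    problems = []

    if profile["total_assets"] - profile["total_liabilities"] != profile["net_worth"]:
        problems.append("net_worth != total_assets - total_liabilities")

    components = (
        profile["property_value"]
        + profile["core_equity"]
        + profile["cash_liquidity"]
    )
    if components > profile["total_assets"]:
        problems.append("asset components exceed total_assets")

    # Assumed rule: jurisdiction and vehicle are either both set or both empty.
    if (profile["offshore_jurisdiction"] is None) != (profile["offshore_vehicle"] is None):
        problems.append("offshore jurisdiction and vehicle must be set together")

    # Assumed rule: a PEP should never carry the lowest risk rating.
    if profile["pep_status"] and profile["risk_rating"] == "low":
        problems.append("PEP profile carries a low risk rating")

    return problems


good = {
    "net_worth": 80, "total_assets": 100, "total_liabilities": 20,
    "property_value": 30, "core_equity": 40, "cash_liquidity": 10,
    "offshore_jurisdiction": "BVI", "offshore_vehicle": "trust",
    "pep_status": False, "risk_rating": "medium",
}
bad = dict(good, net_worth=90, offshore_vehicle=None, pep_status=True, risk_rating="low")

print(check_profile(good))
print(check_profile(bad))
```

Running a check like this over the whole dataset gives your testers evidence that the profiles hang together, which is the property that separates structured test data from random filler.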

Zero PII in the testing environment. DORA’s resilience testing framework assumes that test environments will be subjected to deliberate stress, including simulated breaches. If a test succeeds in finding a vulnerability, the data exposed is synthetic — there is no incident to report, no customer to notify, no regulator to inform. The test did exactly what it was supposed to do.

Documented provenance. Every dataset ships with a Certificate of Sovereign Origin documenting the generation method, the distribution parameters, and the absence of real-world data lineage. When the DORA supervisory authority reviews your testing programme, the provenance of your test data is documented before they ask.
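To illustrate what machine-checkable provenance can look like, here is a small sketch that records the generation method and parameters next to a SHA-256 digest of the dataset. This is not the format of the Certificate of Sovereign Origin. The manifest keys and function names are hypothetical and chosen only to show the mechanism.

```python
import hashlib
import json


def dataset_digest(records):
    """SHA-256 over a canonical JSON encoding (sorted keys, fixed separators)."""
    canonical = json.dumps(records, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def build_manifest(records, method, parameters, seed):
    """Assemble a hypothetical provenance manifest for a generated dataset."""
    return {
        "generation_method": method,
        "distribution_parameters": parameters,
        "seed": seed,
        "real_data_inputs": [],  # empty by design: nothing real was read
        "record_count": len(records),
        "sha256": dataset_digest(records),
    }


def verify(records, manifest):
    """True if the records still match the manifest they shipped with."""
    return (
        len(records) == manifest["record_count"]
        and dataset_digest(records) == manifest["sha256"]
    )


records = [{"net_worth": 80, "total_assets": 100, "total_liabilities": 20}]
manifest = build_manifest(
    records,
    method="pareto-sampling",              # illustrative label
    parameters={"shape": 1.5, "floor": 30_000_000},
    seed=7,
)

print(json.dumps(manifest, indent=2))
print("verified:", verify(records, manifest))
```

The digest lets a reviewer confirm that the dataset used in the test is the one the manifest describes, and a changed or appended record makes verification fail.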

DORA Meets GDPR and the EU AI Act — The Regulatory Trinity

DORA does not exist in isolation. It joins GDPR and the EU AI Act to form what I call the regulatory trinity for financial data.

GDPR Article 25 requires data protection by design — including in test environments. If your resilience test data contains real PII, you need a lawful basis for that processing, purpose limitation, and security measures equivalent to production. See Why GDPR Article 25 Bans Real Data in Test Environments for the full analysis.

EU AI Act Article 10 requires governance of training, validation, and testing data for high-risk AI systems, a category that includes credit scoring and other financial services use cases listed in Annex III. The data must be representative, relevant, and compliant with applicable data protection law. These obligations start to apply in August 2026.

DORA Articles 24-26 require realistic resilience testing with data that does not compromise operational security.

Born-synthetic data satisfies all three simultaneously. It is realistic enough for AI training and resilience testing, contains zero PII for GDPR compliance, and has documented provenance for regulatory review.

Take the GDPR Risk Assessment to see where your current test data practices stand against these overlapping requirements.

FAQ: DORA and Synthetic Data

Who does DORA apply to?

Virtually all EU-regulated financial entities: banks, insurers, investment firms, payment institutions, e-money institutions, crypto-asset service providers, central securities depositories, trade repositories, and their critical ICT third-party service providers. The regulation entered full application on January 17, 2025.

Does DORA explicitly mention synthetic data?

Yes. The DORA framework for Threat-Led Penetration Testing (TLPT), based on TIBER-EU, references the use of synthetic data as part of realistic threat scenario construction. Financial entities conducting TLPT must use data that realistically simulates production conditions without introducing operational risk.

Can I use anonymized production data for DORA resilience testing?

It is technically possible, but it introduces re-identification risk, especially for high-net-worth profiles with distinctive quasi-identifiers. If a simulated breach exposes anonymized data that can be re-identified, you may trigger both DORA incident reporting and GDPR breach notification obligations. Born-synthetic data eliminates this risk entirely.

How does DORA interact with GDPR for test data?

They compound each other. GDPR requires data protection by design in all processing environments including testing. DORA requires realistic resilience testing. Using real data creates a conflict between realism and protection. Born-synthetic data resolves the conflict by providing realism without real PII.

What is the penalty for non-compliance with DORA?

DORA empowers competent authorities to require financial entities to cease activities, impose periodic penalty payments, and issue public notices. The European Supervisory Authorities (EBA, EIOPA, ESMA) can also impose penalties on critical ICT third-party providers. Unlike GDPR’s percentage-of-revenue fines, DORA’s enforcement operates through supervisory measures that can restrict a firm’s operational activities.

The Compliance Window Is Now

DORA is not coming — it is here. Financial entities across the EU are conducting their first cycle of resilience tests under the new framework. The ones using born-synthetic data are testing realistically without exposing a single real customer. The ones using copied production data are hoping the regulator does not ask where the test data came from.

I built KYC-enhanced profiles with 29 interlocked fields — PEP status, risk ratings, sanctions screening, adverse media, beneficial ownership indicators — specifically for this use case. Deterministic. Reproducible. Zero AI on the compliance fields. Every dataset ships with a Certificate of Sovereign Origin.

Download 100 free KYC-Enhanced profiles and run them through your resilience testing pipeline.

