Data & Databases

Swapping a Data Warehouse at Runtime: Zero-Downtime Migration Without Changing a Single Client

with Michael O'Toole & Max Fischer

Friday 10 July 16:20 – 16:50 Stage 10 - powered by TikTok

About This Session

Trade Republic serves 10 million customers with €150 billion under management. Our data warehouse handles 4 million queries daily across analytics, product features, and the ML fraud detection that protects our customers. It replicates 220 databases into a 620 TB lakehouse. It cannot go down.

When we decided to move to an open lakehouse architecture (Apache Iceberg and bring-your-own-compute), "schedule a maintenance window" was not an option. Neither was asking hundreds of consumers (BI tools, pipelines, ML models, product services) to rewrite their connections. The destination is decoupled storage and compute, where teams choose the engine that fits their workload: Spark, Athena, DuckDB, all reading from Iceberg. But the migration path matters as much as the destination.

Our approach: build "Engy", a protocol-compatible proxy that presents the exact wire interface of our existing warehouse. Every client connects the same way it always has. Behind that stable interface, we're free to change everything: swap compute engines, add caching, add features, all invisible to consumers.

The key enabler is in-flight SQL transpilation. The proxy rewrites SQL, translates table references between catalogs, and normalises results, all in the request path. This gives us a multi-engine architecture with perfect interop. Teams onboard without changing a driver, a connection string, or a line of code. We ship new engines and features behind the interface while production traffic is flowing: building the plane as we fly it. The interface becomes a contract that decouples the pace of infrastructure evolution from the pace of consumer adoption.

In this talk I'll walk through how we designed the Engy proxy interface for long-term stability, and how that stability lets us migrate a 620 TB system query-by-query without downtime. True to Trade Republic's engineering philosophy, the entire stack is built on open-source foundations: no vendor tooling, no proprietary middleware.
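
To make the idea of translating table references between catalogs concrete, here is a minimal Python sketch. It is not Engy's implementation: the table names, the `TABLE_MAP` mapping, and the `rewrite_table_refs` helper are all hypothetical, and a regular expression stands in for the real SQL parser a production proxy would need.

```python
import re

# Hypothetical mapping from legacy warehouse table names to Iceberg catalog
# names. These names are invented for illustration only.
TABLE_MAP = {
    "analytics.trades": "iceberg.analytics.trades",
    "public.customers": "iceberg.core.customers",
}

# Matches the identifier that directly follows a FROM or JOIN keyword.
# Assumption: unquoted, dot-separated identifiers. This pattern does not
# understand string literals, comments, CTEs, or quoted names.
_TABLE_REF = re.compile(r"\b(FROM|JOIN)\s+([A-Za-z_][\w.]*)", re.IGNORECASE)


def rewrite_table_refs(sql: str, table_map: dict[str, str] = TABLE_MAP) -> str:
    """Swap known legacy table references for their mapped names.

    Tables that are not in the mapping are left unchanged, so a query can
    mix migrated and not-yet-migrated tables.
    """

    def _swap(match: re.Match) -> str:
        keyword, table = match.group(1), match.group(2)
        return f"{keyword} {table_map.get(table.lower(), table)}"

    return _TABLE_REF.sub(_swap, sql)


if __name__ == "__main__":
    query = (
        "SELECT c.id FROM public.customers c "
        "JOIN analytics.trades t ON t.customer_id = c.id"
    )
    print(rewrite_table_refs(query))
```

Because unmapped tables pass through untouched, a mapping like this can grow one table at a time, which matches the query-by-query migration the abstract describes. Dialect differences between engines and result normalisation are out of scope for this sketch.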

Topics

  • Apache Iceberg
  • Data Lakes
  • DuckDB
  • Lakehouse
  • Performance
  • Python
  • Rust
  • SQL