Álvaro Martín Lozano

Implementing continuous delivery in a data processing pipeline

What if you could roll back a data pipeline instantly, with zero state migration? See how treating data as an immutable, versioned artifact makes this a reality.

#1 · about 4 minutes

From research concepts to production-ready data products

The Volkswagen Data Lab shifted its focus from demonstrating proofs of concept to building and deploying real-world data solutions for its clients.

#2 · about 7 minutes

Core concepts of continuous delivery for data

Continuous delivery for data pipelines adapts standard CI/CD principles to a setting where the data itself is the deliverable, moving each dataset through version control, integration, and deployment stages.
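As a rough illustration of those three stages (a sketch, not code from the talk), the snippet below versions an output by content hash, validates the candidate, and deploys by atomically repointing a symlink. All names here (run_transform, validate, promote) and the symlink-based deployment are assumptions for the example.

```python
import hashlib
import json
from pathlib import Path

def run_transform(raw_path: Path, out_dir: Path) -> Path:
    """'Version control' stage: land output under a content-derived version id."""
    data = raw_path.read_bytes()                       # stand-in for a real transform
    version = hashlib.sha256(data).hexdigest()[:12]
    versioned = out_dir / version
    versioned.mkdir(parents=True, exist_ok=True)
    (versioned / "data.json").write_bytes(data)
    return versioned

def validate(versioned: Path) -> bool:
    """'Integration' stage: automated checks run against the candidate version."""
    records = json.loads((versioned / "data.json").read_text())
    return isinstance(records, list) and len(records) > 0

def promote(versioned: Path, current_link: Path) -> None:
    """'Deployment' stage: atomically repoint consumers at the validated version."""
    tmp = current_link.with_suffix(".tmp")
    tmp.unlink(missing_ok=True)
    tmp.symlink_to(versioned.resolve(), target_is_directory=True)
    tmp.replace(current_link)                          # atomic rename on POSIX

candidate = run_transform(Path("raw/input.json"), Path("dataset/versions"))
if validate(candidate):
    promote(candidate, Path("dataset/current"))
```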

#3 · about 11 minutes

Implementing a pipeline with immutable, versioned data

The five-step pipeline treats data as immutable: every run produces a new versioned output, which makes rollbacks simple and results reproducible.
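To make the rollback claim concrete, here is a minimal sketch of that scheme, assuming a simple filesystem layout (a runs/ directory plus a LATEST pointer file); the talk's actual storage layer may differ. Rolling back never rewrites data, it only moves the pointer.

```python
import datetime as dt
from pathlib import Path

BASE = Path("dataset")          # hypothetical dataset root
POINTER = BASE / "LATEST"       # plain-text file naming the live version

def new_run(records: list[str]) -> str:
    """Write output to a fresh, never-mutated version directory."""
    version = dt.datetime.now(dt.timezone.utc).strftime("run-%Y%m%dT%H%M%SZ")
    out = BASE / "runs" / version
    out.mkdir(parents=True)     # fails if the version already exists: immutability
    (out / "part-00000.txt").write_text("\n".join(records))
    POINTER.write_text(version) # publish: consumers read LATEST to find the data
    return version

def rollback(version: str) -> None:
    """Instant rollback: no state migration, only the pointer changes."""
    assert (BASE / "runs" / version).exists(), "unknown version"
    POINTER.write_text(version)
```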

#4 · about 6 minutes

The challenge of orchestrating chained data jobs

Managing dependencies between jobs becomes complex when each job consumes versioned, immutable data inputs from upstream processes.
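One way to see why chaining gets tricky: each downstream job has to resolve the upstream version it reads, pin it for the duration of the run, and record that lineage so reruns and rollbacks stay consistent across the chain. The sketch below builds on the LATEST-pointer layout assumed above; the manifest format is an illustrative assumption, not the talk's exact scheme.

```python
import json
import datetime as dt
from pathlib import Path

def pinned_input(dataset: Path) -> tuple[str, Path]:
    """Resolve the upstream version once, up front, and pin it for this run."""
    version = (dataset / "LATEST").read_text().strip()
    return version, dataset / "runs" / version / "part-00000.txt"

def run_downstream(upstream: Path, downstream: Path) -> Path:
    in_version, src = pinned_input(upstream)
    out_version = dt.datetime.now(dt.timezone.utc).strftime("run-%Y%m%dT%H%M%SZ")
    out = downstream / "runs" / out_version
    out.mkdir(parents=True)
    (out / "part-00000.txt").write_text(src.read_text().upper())  # toy transform
    # Lineage: record exactly which input version produced this output version,
    # so a rollback or rerun anywhere in the chain stays traceable.
    (out / "manifest.json").write_text(
        json.dumps({"inputs": {upstream.name: in_version}}))
    (downstream / "LATEST").write_text(out_version)
    return out
```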

#5 · about 5 minutes

Pros and cons of the immutable data approach

While this method offers powerful benefits like reproducibility and instant rollbacks, it adds orchestration complexity and increases storage costs.
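On the storage-cost side, a common mitigation (not necessarily the one proposed in the talk) is to prune old versions on a retention policy while never touching the live one:

```python
import shutil
from pathlib import Path

def prune(dataset: Path, keep_last: int = 5) -> None:
    """Delete all but the newest versions; the live version is always kept."""
    live = (dataset / "LATEST").read_text().strip()
    runs = sorted((dataset / "runs").iterdir())   # run ids sort chronologically
    for old in runs[:-keep_last]:
        if old.name != live:                      # never delete the live version
            shutil.rmtree(old)
```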
