Álvaro Martín Lozano

Implementing continuous delivery in a data processing pipeline

What if you could roll back a data pipeline instantly, with zero state migration? See how treating data as an immutable, versioned artifact makes this a reality.

#1 · about 4 minutes

From research concepts to production-ready data products

The Volkswagen Data Lab shifted its focus from demonstrating proofs of concept to building and deploying real-world data solutions for its clients.

#2 · about 7 minutes

Core concepts of continuous delivery for data

Continuous delivery for data pipelines adapts standard CI/CD principles to a setting where the data itself is the deliverable, moving each dataset through version control, integration, and deployment stages.
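As a rough illustration of those three stages (a sketch, not code from the talk), the snippet below versions an output by content hash, validates the candidate, and deploys by atomically repointing a symlink. All names here (run_transform, validate, promote) and the symlink-based deployment are assumptions for the example.

```python
import hashlib
import json
from pathlib import Path

def run_transform(raw_path: Path, out_dir: Path) -> Path:
    """'Version control' stage: land output under a content-derived version id."""
    data = raw_path.read_bytes()                       # stand-in for a real transform
    version = hashlib.sha256(data).hexdigest()[:12]
    versioned = out_dir / version
    versioned.mkdir(parents=True, exist_ok=True)
    (versioned / "data.json").write_bytes(data)
    return versioned

def validate(versioned: Path) -> bool:
    """'Integration' stage: automated checks run against the candidate version."""
    records = json.loads((versioned / "data.json").read_text())
    return isinstance(records, list) and len(records) > 0

def promote(versioned: Path, current_link: Path) -> None:
    """'Deployment' stage: atomically repoint consumers at the validated version."""
    tmp = current_link.with_suffix(".tmp")
    tmp.unlink(missing_ok=True)
    tmp.symlink_to(versioned.resolve(), target_is_directory=True)
    tmp.replace(current_link)                          # atomic rename on POSIX

candidate = run_transform(Path("raw/input.json"), Path("dataset/versions"))
if validate(candidate):
    promote(candidate, Path("dataset/current"))
```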

#3 · about 11 minutes

Implementing a pipeline with immutable, versioned data

The five-step pipeline treats data as immutable: every run produces a new versioned output, which makes rollbacks simple and results reproducible.
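To make the rollback claim concrete, here is a minimal sketch of that scheme, assuming a simple filesystem layout (a runs/ directory plus a LATEST pointer file); the talk's actual storage layer may differ. Rolling back never rewrites data, it only moves the pointer.

```python
import datetime as dt
from pathlib import Path

BASE = Path("dataset")          # hypothetical dataset root
POINTER = BASE / "LATEST"       # plain-text file naming the live version

def new_run(records: list[str]) -> str:
    """Write output to a fresh, never-mutated version directory."""
    version = dt.datetime.now(dt.timezone.utc).strftime("run-%Y%m%dT%H%M%SZ")
    out = BASE / "runs" / version
    out.mkdir(parents=True)     # fails if the version already exists: immutability
    (out / "part-00000.txt").write_text("\n".join(records))
    POINTER.write_text(version) # publish: consumers read LATEST to find the data
    return version

def rollback(version: str) -> None:
    """Instant rollback: no state migration, only the pointer changes."""
    assert (BASE / "runs" / version).exists(), "unknown version"
    POINTER.write_text(version)
```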

#4 · about 6 minutes

The challenge of orchestrating chained data jobs

Managing dependencies between jobs becomes complex when each job consumes versioned, immutable data inputs from upstream processes.
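One way to see why chaining gets tricky: each downstream job has to resolve the upstream version it reads, pin it for the duration of the run, and record that lineage so reruns and rollbacks stay consistent across the chain. The sketch below builds on the LATEST-pointer layout assumed above; the manifest format is an illustrative assumption, not the talk's exact scheme.

```python
import json
import datetime as dt
from pathlib import Path

def pinned_input(dataset: Path) -> tuple[str, Path]:
    """Resolve the upstream version once, up front, and pin it for this run."""
    version = (dataset / "LATEST").read_text().strip()
    return version, dataset / "runs" / version / "part-00000.txt"

def run_downstream(upstream: Path, downstream: Path) -> Path:
    in_version, src = pinned_input(upstream)
    out_version = dt.datetime.now(dt.timezone.utc).strftime("run-%Y%m%dT%H%M%SZ")
    out = downstream / "runs" / out_version
    out.mkdir(parents=True)
    (out / "part-00000.txt").write_text(src.read_text().upper())  # toy transform
    # Lineage: record exactly which input version produced this output version,
    # so a rollback or rerun anywhere in the chain stays traceable.
    (out / "manifest.json").write_text(
        json.dumps({"inputs": {upstream.name: in_version}}))
    (downstream / "LATEST").write_text(out_version)
    return out
```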

#5 · about 5 minutes

Pros and cons of the immutable data approach

While this method offers powerful benefits like reproducibility and instant rollbacks, it adds orchestration complexity and increases storage costs.
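On the storage-cost side, a common mitigation (not necessarily the one proposed in the talk) is to prune old versions on a retention policy while never touching the live one:

```python
import shutil
from pathlib import Path

def prune(dataset: Path, keep_last: int = 5) -> None:
    """Delete all but the newest versions; the live version is always kept."""
    live = (dataset / "LATEST").read_text().strip()
    runs = sorted((dataset / "runs").iterdir())   # run ids sort chronologically
    for old in runs[:-keep_last]:
        if old.name != live:                      # never delete the live version
            shutil.rmtree(old)
```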
