Masterclass

Big Data and AI Architecture: Apache Iceberg, Spark and LLMs

This session explores how LLMs can be integrated with Apache Spark and Apache Iceberg as the foundation of a Big Data to AI architecture. Combining Iceberg, Spark and LLMs gives you a real-world AI architecture that works on your own data.

We'll build an AI application that lets users query massive datasets and extract insights from them using natural language. We'll start by understanding the structure and architecture of a large dataset. Then we'll look at options for querying it with Apache Spark and Trino. Finally, we'll use an LLM to query the dataset in natural language. We'll also look at other uses of LLMs as part of an overall solution, and explore how different LLMs compare.
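
As a rough illustration of the natural-language querying step described above, here is a minimal Python sketch, not the masterclass's actual code. It assumes the model is given the table schema and asked to return a single SQL statement, which is then checked to be read-only before it would be handed to a query engine. The table name `home_sales`, the column names, and the `ask_llm` callable are all hypothetical placeholders.

```python
import re
from typing import Callable

# Hypothetical schema for a home-sales table; the column names are assumptions.
SCHEMA = {
    "sale_date": "date",
    "city": "string",
    "price": "decimal(12,2)",
    "bedrooms": "int",
}

# Keywords that indicate a statement would modify data or table definitions.
_FORBIDDEN = re.compile(
    r"\b(insert|update|delete|merge|drop|alter|create|truncate)\b",
    re.IGNORECASE,
)


def build_prompt(table: str, schema: dict[str, str], question: str) -> str:
    """Describe the table to the model and ask for a single SQL query."""
    columns = "\n".join(f"- {name} ({dtype})" for name, dtype in schema.items())
    return (
        f"You translate questions into SQL for the table {table}.\n"
        f"Columns:\n{columns}\n"
        "Reply with one SELECT statement and nothing else.\n"
        f"Question: {question}"
    )


def is_read_only_select(sql: str) -> bool:
    """Conservative check: one statement, starts with SELECT/WITH, no write keywords."""
    statement = sql.strip().rstrip(";").strip()
    if ";" in statement:  # more than one statement
        return False
    if not re.match(r"(?i)^(select|with)\b", statement):
        return False
    return _FORBIDDEN.search(statement) is None


def question_to_sql(
    question: str,
    ask_llm: Callable[[str], str],  # hypothetical: sends a prompt, returns model text
    table: str = "home_sales",
    schema: dict[str, str] = SCHEMA,
) -> str:
    """Turn a question into SQL via the model, refusing anything not read-only."""
    sql = ask_llm(build_prompt(table, schema, question)).strip()
    if not is_read_only_select(sql):
        raise ValueError(f"Refusing to run generated SQL: {sql!r}")
    return sql
```

In a setup like the one this masterclass describes, `ask_llm` might wrap a call to a locally running model, and the returned string could be passed to a Spark session via `spark.sql(...)` against an Iceberg table. The keyword check is deliberately crude (it would also reject a harmless query whose string literal contains a word like "delete"), so treat it as a starting point rather than a complete safeguard.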

We’ll also discuss where event streaming (Kafka and Flink) fits into this architecture. The architecture is designed to be flexible, so your dev team can choose different technologies for processing and querying. I’ll leave you with a concrete example that you can run on your laptop to explore the possibilities. It is modeled on a real-world application: the dataset covers home sales over the last 15 years.

We will use these technologies:

* Apache Iceberg

* Apache Spark

* Spring AI

* Ollama

* Various LLMs

For Software Developers, Software Architects, and Tech Leads

8 July 2026, Berlin

Full-day masterclass. Only 30 spots.

Speaker

Pratik Patel

Code Hacker

Pratik Patel is a Java Champion, developer advocate at Azul Systems, and the author of three books on programming (Java, Cloud and OSS). He is an all-around software and hardware nerd with experience in the healthcare, telecom, financial services, and startup sectors. He's also a co-organizer of the Atlanta Java User Group and the North Atlanta JavaScript meetup, conference chairperson for Devnexus, a frequent speaker at tech events, and a master builder of nachos.

Access to Masterclass
Full-Day Masterclass Pass • 8 July 2026
Tech Expo - Full Access
40k sqm Full Experience • 9-10 July 2026
Workshops
Pre-registration required • 8 July 2026
Official Congress Party
Official Congress App
Certificate of Participation
Recordings
Fast Lane
Plus Lounge
Exclusive area for networking, lunch, snacks and refreshments
Speakers Lounge
VIP Lounge
Networking for executives & decision-makers
Tech Leaders Night • 8 July 2026
Evening event for executives & special guests
VIP Badge

Masterclass Pass

Now only
€ 379
Single Ticket
Regular price: € 699

Congress Pass & Masterclass Pass

Now only
€ 699
Single Ticket
Regular price: € 1,199

Check out other masterclasses

Advanced AI Systems with MCP, Memory & Human-in-the-Loop

Hosted by:

Sebastian Gingter

Christian Weyer

The Software Engineer 2030: From Coder to AI Orchestrator?

Hosted by:

Patrick Schnell

Mastering Software Architecture

Hosted by:

David Tielke

Cross-Framework Frontend Performance Bootcamp

Hosted by:

Peter Kröner

Spec First Development: Building and Modernizing Apps with Agentic AI

Hosted by:

Julia Kordick

Mastering Modern Architecture: Building Flexible, Distributed Systems with Hands-On Code

Hosted by:

Oliver Sturm

Deep Dive Workshop: AI for Enterprise Developers

Hosted by:

Dr. Damir Dobric

Let the spec speak: Building intelligent tests with Gherkin and Playwright

Hosted by:

Elio Struyf

Luise Freese

Building Infrastructure Tools with Kubernetes Operators and Go

Hosted by:

Rabieh Fashwall

Observability Masterclass with OpenTelemetry: Designing, Implementing & Debugging Production Systems

Hosted by:

Shramish Kafle

Cloud-Native Testing: A Hands-On Masterclass for Modern Infrastructure

Hosted by:

Moataz Nabil

Event-Driven Microservices: Patterns and Practices for Production-Ready Systems

Hosted by:

Lutz Huehnken

From Chaos to Blueprint: Rapid Architecture for Greenfield & Legacy Systems

Hosted by:

Hendrik Lösch

Designing architecture and code that’s easy to change and test

Hosted by:

Dennis Doomen

Modern Angular Architectures: SignalStore, Signal Forms, and Agentic UI

Hosted by:

Manfred Steyer

The Cake Is a Lie: Fixing (Login) Accessibility

Hosted by:

Ramona Schwering

Injection Inspection: Defending Against Data Manipulation Attacks

Hosted by:

Wekoslav Stefanovski

Bozidar Spirovski

Taming Hallucinations in Production: Hands-On Masterclass on Agents and RAG Systems

Hosted by:

Miriam Kümmel

GitHub Copilot Masterclass: From Autocomplete to Virtual Agents

Hosted by:

Marc Müller

Neno Loje

Can’t find a specific topic you would love to see as a Masterclass? Reach out to us at tickets@wearedevelopers.com