Fabian Pottbäcker, Thomas Endres & Martin Foertsch
AI'll Be Back: Generative AI in Image, Video, and Audio Production
#1 about 2 minutes
The hype and promise of generative AI
Generative AI is at the peak of the Gartner Hype Cycle, with applications spanning text, image, audio, and video generation.
#2 about 1 minute
How large language models generate text
Large language models (LLMs) work as next-word (more precisely, next-token) predictors: they generate text one token at a time, which produces the typewriter-like effect.
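To make the token-by-token loop concrete, here is a minimal sketch in Python. It is not how a real LLM is implemented: the hand-written table BIGRAM_PROBS is a made-up stand-in for the model, and greedy decoding (always taking the most probable token) is only one of several possible sampling strategies.

```python
# Toy next-token predictor. BIGRAM_PROBS is a hypothetical, hand-written
# probability table standing in for a trained language model.
BIGRAM_PROBS = {
    "<start>": {"the": 0.9, "a": 0.1},
    "the": {"robot": 0.6, "movie": 0.4},
    "a": {"robot": 1.0},
    "robot": {"returns": 0.7, "sleeps": 0.3},
    "movie": {"ends": 1.0},
    "returns": {"<end>": 1.0},
    "sleeps": {"<end>": 1.0},
    "ends": {"<end>": 1.0},
}


def generate(max_tokens=10):
    """Greedy decoding: repeatedly append the most probable next token."""
    tokens = ["<start>"]
    for _ in range(max_tokens):
        candidates = BIGRAM_PROBS[tokens[-1]]
        next_token = max(candidates, key=candidates.get)
        if next_token == "<end>":
            break
        # A chat interface would display next_token at this point, one
        # token per step, which is where the typewriter effect comes from.
        tokens.append(next_token)
    return tokens[1:]  # drop the <start> marker


print(" ".join(generate()))
```

With this table the loop prints "the robot returns". A real model conditions on the whole preceding sequence rather than only the last token.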
#3 about 3 minutes
Understanding tokenization and semantic embeddings
Text is split into tokens, each represented by a numeric ID, and then mapped into a multi-dimensional vector space where semantically similar words are located close together.
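The two steps can be sketched as follows. Everything here is an illustrative assumption: the three-word VOCAB, the whitespace tokenizer (real tokenizers usually work on subword units), and the hand-picked three-dimensional EMBEDDINGS (real models learn vectors with hundreds or thousands of dimensions). Cosine similarity is used as one common way to measure closeness.

```python
import math

# Hypothetical vocabulary: token string -> numeric token ID.
VOCAB = {"king": 0, "queen": 1, "banana": 2}

# Hypothetical embedding table, indexed by token ID. The values are
# hand-picked so that "king" and "queen" point in a similar direction.
EMBEDDINGS = [
    [0.90, 0.80, 0.10],  # king
    [0.85, 0.90, 0.15],  # queen
    [0.10, 0.05, 0.95],  # banana
]


def tokenize(text):
    """Map whitespace-separated words to token IDs (toy tokenizer)."""
    return [VOCAB[word] for word in text.lower().split()]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


ids = tokenize("king queen banana")
king, queen, banana = (EMBEDDINGS[i] for i in ids)
print(ids)
print(cosine_similarity(king, queen) > cosine_similarity(king, banana))
```

In this toy table, "king" and "queen" score much higher against each other than either does against "banana", which is the property the embedding space is meant to have.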
#4 about 3 minutes
The role of transformers and the attention mechanism
The transformer architecture uses an attention mechanism to weigh the importance of different words in the input sequence to understand context and resolve ambiguity.
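As a rough illustration of the weighting step, the sketch below implements scaled dot-product attention in plain Python. It is a simplification under stated assumptions: the token vectors are made up, and the same vectors serve as queries, keys, and values, whereas a real transformer derives each of them through learned projection matrices and uses multiple attention heads.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Output is the weighted average of the value vectors.
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Hypothetical 2-d vectors for the tokens "bank", "river", "loan".
tokens = [[1.0, 1.0], [2.0, 0.0], [0.0, 0.5]]
outputs, weights = attention(tokens, tokens, tokens)
print(weights[0])  # how strongly "bank" attends to each token
```

With these made-up vectors, "bank" puts more weight on "river" than on "loan", so its output vector is pulled toward the river sense. This is the kind of context-dependent disambiguation the attention mechanism provides.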
#5 about 2 minutes
Connecting text and images with the CLIP model
The CLIP model establishes a shared embedding space for text and images, enabling the system to measure the semantic similarity between a text description and a picture.
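The idea of a shared space can be sketched without the real model. In the example below, TEXT_EMBEDDINGS and IMAGE_EMBEDDINGS are hypothetical, hand-written vectors. In CLIP itself they would come from a trained text encoder and a trained image encoder. The helper best_caption is also an illustrative name, not part of any library.

```python
import math


def normalize(vector):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def similarity(a, b):
    """Cosine similarity: dot product of the unit-length vectors."""
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))


# Hypothetical outputs of a text encoder (same 3-d space as the images).
TEXT_EMBEDDINGS = {
    "a photo of a dog": [0.9, 0.1, 0.2],
    "a photo of a cat": [0.1, 0.9, 0.2],
}

# Hypothetical outputs of an image encoder.
IMAGE_EMBEDDINGS = {
    "dog.jpg": [0.8, 0.2, 0.1],
    "cat.jpg": [0.2, 0.85, 0.3],
}


def best_caption(image_name):
    """Return the caption whose embedding is closest to the image's."""
    image_vec = IMAGE_EMBEDDINGS[image_name]
    return max(
        TEXT_EMBEDDINGS,
        key=lambda caption: similarity(TEXT_EMBEDDINGS[caption], image_vec),
    )


print(best_caption("dog.jpg"))
print(best_caption("cat.jpg"))
```

Because both kinds of input land in the same space, a single similarity function can compare a caption with a picture directly.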
#6 about 7 minutes
How diffusion models create images from noise
Diffusion models generate images through an iterative process of predicting and subtracting noise from a random starting point, guided by a text prompt's embedding.
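The predict-and-subtract loop can be sketched on a four-value "image". This is a heavily simplified illustration, not a real diffusion sampler: predict_noise is an oracle that already knows the clean result for the prompt (PROMPT_TARGETS is made up), whereas a real model uses a trained neural network conditioned on the prompt embedding. The fixed step_size also replaces the noise schedule a real sampler would use.

```python
import random

# Hypothetical "clean images" per prompt, as flat lists of pixel values.
PROMPT_TARGETS = {"sunset": [0.9, 0.5, 0.1, 0.0]}


def predict_noise(image, prompt):
    """Stand-in for the trained network: estimate the noise in the image.

    Here the estimate is exact because the target is known in advance.
    """
    target = PROMPT_TARGETS[prompt]
    return [pixel - clean for pixel, clean in zip(image, target)]


def generate(prompt, steps=50, step_size=0.2, seed=0):
    """Start from random noise and iteratively subtract predicted noise."""
    rng = random.Random(seed)
    image = [rng.gauss(0.0, 1.0) for _ in PROMPT_TARGETS[prompt]]
    for _ in range(steps):
        noise = predict_noise(image, prompt)
        # Remove only a fraction of the predicted noise per step.
        image = [pixel - step_size * n for pixel, n in zip(image, noise)]
    return image


result = generate("sunset")
print([round(pixel, 3) for pixel in result])
```

Each step shrinks the remaining noise by a constant factor, so after enough steps the random starting point has converged to the prompt's target.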
#7 about 5 minutes
Applying diffusion transformers to video generation
For video generation, a diffusion transformer splits the video into patches and applies the denoising process to the entire sequence at once, which keeps the frames coherent with one another.
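The patching step can be sketched as follows, assuming the patches span several frames as well as a region within each frame (often called spacetime patches). The function names, the nested-list video format, and the patch sizes are illustrative assumptions, and the code assumes the video dimensions are exact multiples of the patch sizes.

```python
def make_video(frames, height, width):
    """Build a toy video whose pixel value encodes its (frame, row, col)."""
    return [
        [[f * 100 + r * 10 + c for c in range(width)] for r in range(height)]
        for f in range(frames)
    ]


def to_spacetime_patches(video, patch_frames, patch_height, patch_width):
    """Cut a video into flattened patches that span time and space.

    Assumes each video dimension is divisible by the matching patch size.
    """
    frames, height, width = len(video), len(video[0]), len(video[0][0])
    patches = []
    for f0 in range(0, frames, patch_frames):
        for r0 in range(0, height, patch_height):
            for c0 in range(0, width, patch_width):
                patch = [
                    video[f][r][c]
                    for f in range(f0, f0 + patch_frames)
                    for r in range(r0, r0 + patch_height)
                    for c in range(c0, c0 + patch_width)
                ]
                patches.append(patch)
    return patches


video = make_video(frames=4, height=4, width=4)
patches = to_spacetime_patches(video, 2, 2, 2)
print(len(patches), len(patches[0]))
print(patches[0])
```

Each patch becomes one element of the sequence the transformer processes. Because a patch already contains pixels from more than one frame, and attention runs over all patches together, the denoising step sees the whole clip rather than isolated frames.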
#8 about 1 minute
Advanced techniques for video manipulation and editing
Beyond simple generation, models can perform image-to-video conversion, extend existing clips, interpolate between two different videos, or edit specific regions.
#9 about 2 minutes
Current limitations and physical inconsistencies in AI video
Generative video models still struggle to understand cause and effect, which leads to physically impossible events and to objects that appear or behave illogically.
#10 about 3 minutes
Ethical challenges of generative AI training data
Major ethical concerns include training models on copyrighted or publicly available data without consent, which has led to legal challenges and open questions about ownership.