Stanislas Girard

Chatbots are going to destroy infrastructures and your cloud bills

That simple AI feature is secretly a costly monolith. Learn how to separate fast and slow tasks before your cloud bill explodes.

Chatbots are going to destroy infrastructures and your cloud bills
#1 · about 3 minutes

Comparing web developers and data scientists before GenAI

Before generative AI, web developers focused on CPU-bound tasks and horizontal scaling, while data scientists worked on GPU-bound tasks with large, dedicated compute resources.

#2 · about 3 minutes

The new AI engineer role and the RAG pipeline

The emergence of the AI engineer role combines web development and data science skills, often applied to building RAG pipelines for data ingestion and querying.

#3 · about 2 minutes
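The two halves of a RAG pipeline, ingestion and querying, can be sketched in a few lines. This is a minimal illustration, not the talk's implementation: `embed` is a toy bag-of-words stand-in for a real embedding model, and all function names are assumptions.

```python
import re
from collections import Counter
from math import sqrt

# Toy stand-in for a real embedding model: a bag-of-words vector.
def embed(text: str) -> Counter:
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Ingestion stage: split documents into chunks and index their embeddings.
def ingest(docs: list[str], chunk_size: int = 50) -> list[tuple[Counter, str]]:
    index = []
    for doc in docs:
        words = doc.split()
        for i in range(0, len(words), chunk_size):
            chunk = " ".join(words[i:i + chunk_size])
            index.append((embed(chunk), chunk))
    return index

# Query stage: retrieve the most similar chunks and build the LLM prompt.
def build_prompt(question: str, index: list, k: int = 2) -> str:
    q = embed(question)
    ranked = sorted(index, key=lambda e: cosine(q, e[0]), reverse=True)
    context = "\n".join(chunk for _, chunk in ranked[:k])
    return f"Context:\n{context}\n\nQuestion: {question}"
```

In production the toy pieces are swapped for an embedding model, a vector store, and an LLM call, but the two-stage shape stays the same.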

Key architectural challenges in building GenAI apps

Generative AI applications face unique architectural problems, including long response times, sequential bottlenecks, and the difficulty of mixing CPU-bound and GPU-bound processes.

#4 · about 3 minutes
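The sequential-bottleneck problem can be made concrete with a sketch. The two steps below are hypothetical stand-ins (`asyncio.sleep` simulates GPU and database latency): awaited one after another they cost the sum of their latencies, run concurrently they cost only the maximum.

```python
import asyncio
import time

async def embed_query(q: str) -> str:       # GPU-bound in practice
    await asyncio.sleep(0.2)
    return f"vec({q})"

async def fetch_user_history(uid: int) -> str:  # I/O-bound database call
    await asyncio.sleep(0.2)
    return f"history({uid})"

async def answer_sequential(q: str, uid: int):
    vec = await embed_query(q)            # the second await waits on the first...
    hist = await fetch_user_history(uid)  # ...even though they are independent
    return vec, hist

async def answer_concurrent(q: str, uid: int):
    # Independent steps run in parallel: latency is max(), not sum().
    return await asyncio.gather(embed_query(q), fetch_user_history(uid))

start = time.perf_counter()
asyncio.run(answer_sequential("q", 1))
seq = time.perf_counter() - start  # roughly 0.4 s

start = time.perf_counter()
asyncio.run(answer_concurrent("q", 1))
con = time.perf_counter() - start  # roughly 0.2 s
```

The same reasoning applies whether concurrency comes from `asyncio`, threads, or separate services.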

How a simple chatbot evolves into a large monolith

Adding features like document ingestion and web scraping to a simple chatbot can rapidly increase its resource consumption and Docker image size, creating a complex monolith.

#5 · about 4 minutes

Refactoring a monolithic AI app into a service architecture

To manage complexity and cost, a monolithic AI application should be refactored by separating user-facing logic from heavy background tasks into distinct, independently scalable services.

#6 · about 3 minutes
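One common shape for that refactor can be sketched with an in-process `queue.Queue` standing in for a real broker such as Redis or RabbitMQ (the names and flow here are illustrative assumptions, not the talk's code): the user-facing service only enqueues a job and returns, while a separately scalable worker does the heavy lifting.

```python
import queue
import threading
import uuid

jobs: "queue.Queue[tuple[str, str]]" = queue.Queue()  # stand-in for a message broker
results: dict[str, str] = {}                          # stand-in for a results store

# User-facing service: enqueue and return immediately, stays small and cheap.
def submit_ingestion(doc: str) -> str:
    job_id = str(uuid.uuid4())
    jobs.put((job_id, doc))
    return job_id  # the client polls for the result later

# Worker service: deployed and scaled independently, can live on GPU nodes.
def worker() -> None:
    while True:
        job_id, doc = jobs.get()
        results[job_id] = f"embedded:{doc}"  # the heavy work happens here
        jobs.task_done()

threading.Thread(target=worker, daemon=True).start()
job_id = submit_ingestion("quarterly report.pdf")
jobs.join()  # in a real system the client would poll instead of blocking
```

Because the two sides only share the queue, each can be scaled (or scaled to zero) on its own.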

Choosing the right architecture for your application's workload

A monolithic architecture is suitable for low or continuous workloads, while a service-based approach is necessary for applications with high or spiky traffic to manage costs and scale effectively.

#7 · about 2 minutes
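A back-of-the-envelope comparison shows why spiky traffic favors the service split. The prices below are assumptions for illustration only, not real quotes: an always-on GPU monolith pays for every idle hour, while a split design keeps only a cheap API warm and pays for GPU time per busy hour.

```python
# Hypothetical prices; real numbers vary by provider and instance type.
GPU_HOURLY = 1.20  # $/h for an always-on GPU node (assumed)
CPU_HOURLY = 0.10  # $/h for the small user-facing service (assumed)

def monolith_monthly_cost() -> float:
    # One big container with the GPU attached runs 24/7.
    return GPU_HOURLY * 24 * 30

def services_monthly_cost(gpu_busy_hours_per_day: float) -> float:
    # The small API runs 24/7; GPU workers scale to zero when idle.
    return CPU_HOURLY * 24 * 30 + GPU_HOURLY * gpu_busy_hours_per_day * 30
```

Under these assumed prices, two busy GPU hours a day costs about $144/month for the split design versus $864/month for the always-on monolith; with near-continuous GPU load the gap closes and the monolith's simplicity can win.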

Overlooked challenges of running AI applications in production

Beyond core architecture, running AI in production involves complex challenges like managing GPUs on Kubernetes, model versioning, data compliance, and testing non-deterministic outputs.

#8 · about 2 minutes
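Exact-match tests break on non-deterministic output; a common workaround is asserting properties of the output rather than its exact text. The checks below are hypothetical examples of such properties, not the talk's test suite.

```python
# Property-based checks for a non-deterministic summarizer's output.
def check_summary(output: str, source: str) -> list[str]:
    failures = []
    if len(output.split()) > 50:
        failures.append("summary too long")
    # Crude substring overlap check; a real suite might use embeddings.
    if not any(word in source.lower() for word in output.lower().split()):
        failures.append("summary shares no vocabulary with source")
    if "as an ai language model" in output.lower():
        failures.append("refusal boilerplate leaked into output")
    return failures
```

Each run of the model may produce different text, but every acceptable run should pass the same property checks.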

Using creative evaluations and starting with small models

A creative evaluation using a game like Street Fighter reveals that smaller, faster LLMs can outperform larger ones for many use cases, making them a better starting point.
