Aarno Aukia
DevOps for AI: running LLMs in production with Kubernetes and KubeFlow
#1about 3 minutes
Applying DevOps principles to machine learning operations
The maturity of software operations from reactive firefighting to automated DevOps provides a model for improving current MLOps practices.
#2about 3 minutes
Defining AI, machine learning, and generative AI
AI is a broad concept that has evolved through machine learning and deep learning to the latest trend of generative AI, which can create new content.
#3about 4 minutes
How large language models generate text with tokens
LLMs work by converting text into numerical tokens and then using a large statistical model to predict the most probable next token in a sequence.
#4about 2 minutes
Using prompt engineering to guide LLM responses
Prompt engineering involves crafting detailed instructions and providing context within a prompt to guide the LLM toward a desired and accurate answer.
#5about 2 minutes
Understanding and defending against prompt injection attacks
User-provided input can be manipulated to bypass instructions or extract sensitive information, requiring defensive measures against prompt injection.
#6about 3 minutes
Advanced techniques like RAG and model fine-tuning
Beyond basic prompts, you can use Retrieval-Augmented Generation (RAG) to add dynamic context or fine-tune a model with specific data for better performance.
#7about 5 minutes
Choosing between cloud APIs and self-hosted models
LLMs can be consumed via managed cloud APIs, which are simple but opaque, or by self-hosting open-source models for greater control and data privacy.
#8about 2 minutes
Streamlining local development with the Ollama tool
Ollama simplifies running open-source LLMs on a local machine for development by managing model downloads and hardware acceleration, acting like Docker for LLMs.
#9about 6 minutes
Running LLMs in production with Kubeflow and KServe
Kubeflow and its component KServe provide a robust, Kubernetes-native framework for deploying, scaling, and managing LLMs in a production environment.
#10about 2 minutes
Monitoring LLM performance with KServe's observability tools
KServe integrates with tools like Prometheus and Grafana to provide detailed metrics and dashboards for monitoring LLM response times and resource usage.
Related jobs
Jobs that call for the skills explored in this talk.
Matching moments
06:19 MIN
Defining LLMOps and understanding its core benefits
From Traction to Production: Maturing your LLMOps step by step
00:20 MIN
The lifecycle for operationalizing AI models in business
Detecting Money Laundering with AI
01:01 MIN
Understanding the role and challenges of MLOps
The Road to MLOps: How Verivox Transitioned to AWS
22:41 MIN
Introducing the Azure AI platform for end-to-end LLMOps
From Traction to Production: Maturing your LLMOps step by step
30:10 MIN
The future of AI in DevOps and MLOps
Navigating the AI Wave in DevOps
29:33 MIN
Applying software engineering discipline to AI development
Navigating the AI Revolution in Software Development
00:11 MIN
The challenge of operationalizing production machine learning systems
Model Governance and Explainable AI as tools for legal compliance and risk management
24:42 MIN
Overcoming the challenges of productionizing AI models
Navigating the AI Revolution in Software Development
Featured Partners
Related Videos
The state of MLOps - machine learning in production at enterprise scale
Bas Geerdink
From Traction to Production: Maturing your LLMOps step by step
Maxim Salnikov
LLMOps-driven fine-tuning, evaluation, and inference with NVIDIA NIM & NeMo Microservices
Anshul Jindal
Self-Hosted LLMs: From Zero to Inference
Roberto Carratalá & Cedric Clyburn
DevOps for Machine Learning
Hauke Brammer
One AI API to Power Them All
Roberto Carratalá
Creating Industry ready solutions with LLM Models
Vijay Krishan Gupta & Gauravdeep Singh Lotey
How to Avoid LLM Pitfalls - Mete Atamel and Guillaume Laforge
Meta Atamel & Guillaume Laforge
From learning to earning
Jobs that call for the skills explored in this talk.

DevOps Engineer – Kubernetes & Cloud (m/w/d)
epostbox epb GmbH
Berlin, Germany
Intermediate
Senior
DevOps
Kubernetes
Cloud (AWS/Google/Azure)

Lead Fullstack Engineer AI
Hubert Burda Media
München, Germany
€80-95K
Intermediate
React
Python
Vue.js
Langchain
+1

Machine Learning (MLOps) Engineer
Da Vinci Engineering GmbH
Intermediate
Azure
DevOps
Python
Docker
PyTorch
+6



Machine Learning Engineer - AI-Features, Agenten & Multi Agenten Workflows
dmrz
API
.NET
DevOps
Python
Microservices
+2

Senior Systems/DevOps Developer (f/m/d)
Bonial International GmbH
Berlin, Germany
Senior
Python
Terraform
Kubernetes
Elasticsearch
Amazon Web Services (AWS)

Data Scientist- Python/MLflow-NLP/MLOps/Generative AI
ITech Consult AG
Azure
Python
PyTorch
TensorFlow
Machine Learning

AI Engineer / MLOps Engineer
VESTIGAS GmbH
Senior
Azure
Python
Machine Learning
Natural Language Processing