Aarno Aukia

Oct 2, 2024 • WeAreDevelopers LIVE

DevOps for AI: running LLMs in production with Kubernetes and KubeFlow

Stop firefighting your MLOps. Learn to apply proven DevOps principles with Kubernetes and KubeFlow to reliably run and scale large language models in production.

#1about 3 minutes

Applying DevOps principles to machine learning operations

The maturity of software operations from reactive firefighting to automated DevOps provides a model for improving current MLOps practices.

#2about 3 minutes

Defining AI, machine learning, and generative AI

AI is a broad concept that has evolved through machine learning and deep learning to the latest trend of generative AI, which can create new content.

#3about 4 minutes

How large language models generate text with tokens

LLMs work by converting text into numerical tokens and then using a large statistical model to predict the most probable next token in a sequence.

#4about 2 minutes

Using prompt engineering to guide LLM responses

Prompt engineering involves crafting detailed instructions and providing context within a prompt to guide the LLM toward a desired and accurate answer.

#5about 2 minutes

Understanding and defending against prompt injection attacks

User-provided input can be manipulated to bypass instructions or extract sensitive information, requiring defensive measures against prompt injection.

#6about 3 minutes

Advanced techniques like RAG and model fine-tuning

Beyond basic prompts, you can use Retrieval-Augmented Generation (RAG) to add dynamic context or fine-tune a model with specific data for better performance.

#7about 5 minutes

Choosing between cloud APIs and self-hosted models

LLMs can be consumed via managed cloud APIs, which are simple but opaque, or by self-hosting open-source models for greater control and data privacy.

#8about 2 minutes

Streamlining local development with the Ollama tool

Ollama simplifies running open-source LLMs on a local machine for development by managing model downloads and hardware acceleration, acting like Docker for LLMs.

#9about 6 minutes

Running LLMs in production with Kubeflow and KServe

Kubeflow and its component KServe provide a robust, Kubernetes-native framework for deploying, scaling, and managing LLMs in a production environment.

#10about 2 minutes

Monitoring LLM performance with KServe's observability tools

KServe integrates with tools like Prometheus and Grafana to provide detailed metrics and dashboards for monitoring LLM response times and resource usage.

17 days ago

Senior Machine Learning Engineer (f/m/d)

MARKT-PILOT GmbH
Stuttgart, Germany

Remote

Senior

5 days ago

AI Software Engineer (m/f/d)

Sunhat
Köln, Germany

Remote

Senior

11 days ago

Senior AI Software Developer & Mentor

Dynatrace
Linz, Austria

Senior

Featured Partners

The state of MLOps - machine learning in production at enterprise scale

The state of MLOps - machine learning in production at enterprise scale

Bas Geerdink

about 4 years ago • WeAreDevelopers LIVE

DevOps for Machine Learning

DevOps for Machine Learning

Hauke Brammer

about 4 years ago • World Congress 2021

Creating Industry ready solutions with LLM Models

Creating Industry ready solutions with LLM Models

Vijay Krishan Gupta & Gauravdeep Singh Lotey

about a year ago • WeAreDevelopers LIVE

From Traction to Production: Maturing your LLMOps step by step

From Traction to Production: Maturing your LLMOps step by step

Maxim Salnikov

about 10 months ago • WeAreDevelopers LIVE

Effective Machine Learning - Managing Complexity with MLOps

Effective Machine Learning - Managing Complexity with MLOps

Simon Stiebellehner

about 4 years ago • World Congress 2021

Multilingual NLP pipeline up and running from scratch

Multilingual NLP pipeline up and running from scratch

Kateryna Hrytsaienko

about a year ago • WeAreDevelopers LIVE

Data Privacy in LLMs: Challenges and Best Practices

Data Privacy in LLMs: Challenges and Best Practices

Aditi Godbole

about a year ago • WeAreDevelopers LIVE

How to Avoid LLM Pitfalls - Mete Atamel and Guillaume Laforge

How to Avoid LLM Pitfalls - Mete Atamel and Guillaume Laforge

Meta Atamel & Guillaume Laforge

about 6 months ago • Coffee With Developers

From learning to earning

Jobs that call for the skills explored in this talk.

8 days ago

Machine Learning (IA/ML) + DevOps (MLOps)

Alten
Municipality of Madrid, Spain

Remote

Java

DevOps

Python

Kubernetes

+3

8 days ago

ML/DevOps Engineer at dynamic AI/ Computer Vision company

Nomitri
Berlin, Germany

C++

Bash

Azure

DevOps

Python

+12

2 days ago

Machine Learning Engineer, MLOps/GenAI, Engine AI Center of Excellence (AICE)

Amazon.com, Inc
Berlin, Germany

Machine Learning

Natural Language Processing

8 days ago

AI DevOps Lead

OMNIOS
Barcelona, Spain

Remote

API

GIT

JIRA

React

+7

3 days ago

MLOps / DevOps Engineer

Municipality of Madrid, Spain

€40-60K

Azure

DevOps

Python

Docker

+5

8 days ago

AI Engineer / Machine Learning Engineer / KI-Entwickler (m/w/d) - Schwerpunkt Cloud & MLOps

Agenda GmbH
Rosenheim, Germany

Intermediate

API

Azure

Python

Docker

PyTorch

+9

8 days ago

Machine Learning Engineers

Ai-powered

Computer Vision

Machine Learning

8 days ago

Machine Learning Engineer (MLOps)

European Dynamics
Brussels, Belgium

ELK

Bash

Unix

Hive

Azure

+19

4 days ago

AI/ML Team Lead - Generative AI (LLMs, AWS)

Provectus
Canton de Saint-Mihiel, France

Remote

€96K

Senior

Python

PyTorch

TensorFlow

+4