Roberto Carratalá & Cedric Clyburn

Self-Hosted LLMs: From Zero to Inference

Stop sending your private data to third-party AI. This talk shows you how to self-host powerful language models for complete control and security.

Self-Hosted LLMs: From Zero to Inference
#1about 3 minutes

The rise of self-hosted open source AI models

Self-hosting large language models offers developers greater privacy, cost savings, and control compared to third-party cloud AI services.

#2about 2 minutes

Key benefits of local LLM deployment for developers

Running models locally improves the development inner loop, provides full data privacy, and allows for greater customization and control over the AI stack.

#3about 3 minutes

Comparing open source tools for serving LLMs

Explore different open source tools like Ollama for local development, vLLM for scalable production, and Podman AI Lab for containerized AI applications.

#4about 3 minutes

How to select the right open source LLM

Navigate the vast landscape of open source models by understanding different model families, their specific use cases, and naming conventions.

#5about 3 minutes

Using quantization to run large models locally

Model quantization compresses LLMs to reduce their memory footprint, enabling them to run efficiently on consumer hardware like laptops with CPUs or GPUs.

#6about 1 minute

Strategies for integrating local LLMs with your data

Learn three key methods for connecting local models to your data: Retrieval-Augmented Generation (RAG), local code assistants, and building agentic applications.

#7about 6 minutes

Demo: Building a RAG system with local models

Use Podman AI Lab to serve a local LLM and connect it to AnythingLLM to create a question-answering system over your private documents.

#8about 5 minutes

Demo: Setting up a local AI code assistant

Integrate a self-hosted LLM with the Continue VS Code extension to create a private, offline-capable AI pair programmer for code generation and analysis.

#9about 4 minutes

Demo: Building an agentic app with external tools

Create an agentic application that uses a local LLM with external tools via the Model Context Protocol (MCP) to perform complex, multi-step tasks.

#10about 1 minute

Conclusion and the future of open source AI

Self-hosting provides a powerful, private, and customizable alternative to third-party services, highlighting the growing potential of open source AI for developers.

Related jobs
Jobs that call for the skills explored in this talk.

Featured Partners

From learning to earning

Jobs that call for the skills explored in this talk.