Sergio Perez & Harshita Seth

Aug 20, 2025 • World Congress 2025

Adding knowledge to open-source LLMs

When is retrieval-augmented generation not enough? Learn the multi-stage process for deeply embedding new knowledge into an open-source LLM.

#1 about 4 minutes

Understanding the LLM training pipeline and knowledge gaps

LLMs are trained through pre-training and alignment, but require new knowledge to stay current, adapt to specific domains, and acquire new skills.

#2 about 5 minutes

Adding domain knowledge with continued pre-training

Continued pre-training adapts a foundation model to a specific domain by training it further on specialized, unlabeled data using self-supervised learning.
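To make the self-supervised objective concrete, here is a minimal sketch, assuming the usual next-token prediction setup: the labels are the unlabeled text itself, shifted by one position. The toy vocabulary, the example sentence, and the uniform "model" are hypothetical stand-ins for illustration, not material from the talk.

```python
import math

def next_token_pairs(token_ids):
    """Shift the sequence by one so each position predicts the following token."""
    inputs = token_ids[:-1]
    labels = token_ids[1:]
    return inputs, labels

def cross_entropy(probs_per_position, labels):
    """Average negative log-likelihood assigned to the true next tokens."""
    losses = [-math.log(probs[label])
              for probs, label in zip(probs_per_position, labels)]
    return sum(losses) / len(losses)

# Hypothetical domain sentence and toy vocabulary (assumptions for illustration).
vocab = {"the": 0, "valve": 1, "regulates": 2, "pressure": 3}
token_ids = [vocab[word] for word in "the valve regulates pressure".split()]

inputs, labels = next_token_pairs(token_ids)

# Stand-in for a model: a uniform distribution over the vocabulary at each position.
uniform_probs = [[1.0 / len(vocab)] * len(vocab) for _ in inputs]
loss = cross_entropy(uniform_probs, labels)
```

No human annotation is involved: the training signal comes entirely from the raw domain text, which is why large unlabeled corpora can be used at this stage.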

#3 about 6 minutes

Developing skills and reasoning with supervised fine-tuning

Supervised fine-tuning uses instruction-based datasets to teach models specific tasks, chat capabilities, and complex reasoning through techniques like chain of thought.
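As a rough illustration of how an instruction dataset becomes a training example, the sketch below concatenates a prompt and a response and masks the prompt positions so that only response tokens contribute to the loss. The prompt template, the whitespace tokenizer, and the masking choice are assumptions made for this example; the value -100 follows a common convention (it is the default `ignore_index` of PyTorch's cross-entropy loss) and is not something the talk summary specifies.

```python
IGNORE_INDEX = -100  # common convention for "do not compute loss at this position"

def toy_tokenize(text, vocab):
    """Hypothetical whitespace tokenizer that grows the vocabulary on the fly."""
    return [vocab.setdefault(word, len(vocab)) for word in text.split()]

def build_sft_example(instruction, response, vocab):
    """Return (input_ids, labels) with the prompt tokens masked out of the loss."""
    # The template below is an assumption; real chat templates vary by model.
    prompt_ids = toy_tokenize(f"Instruction: {instruction} Response:", vocab)
    response_ids = toy_tokenize(response, vocab)
    input_ids = prompt_ids + response_ids
    labels = [IGNORE_INDEX] * len(prompt_ids) + response_ids
    return input_ids, labels

vocab = {}
input_ids, labels = build_sft_example(
    "Name the capital of France.", "Paris.", vocab
)
supervised_positions = sum(1 for label in labels if label != IGNORE_INDEX)
```

Chain-of-thought data fits the same shape: the response simply contains the intermediate reasoning steps before the final answer, so the model is supervised on the reasoning as well.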

#4 about 8 minutes

Aligning models with human preferences using reinforcement learning

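Preference alignment refines model behavior using reinforcement learning. The approach has evolved from reinforcement learning from human feedback (RLHF), which requires training a separate reward model, to simpler methods such as direct preference optimization (DPO).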
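To show why DPO is considered simpler, here is a minimal sketch of its per-example loss, assuming the standard formulation: it needs only the log-probabilities of a preferred and a rejected response under the policy and under a frozen reference model, with no separate reward model. The numeric log-probabilities and the `beta` value are hypothetical.

```python
import math

def dpo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO loss for one preference pair, given sequence log-probabilities.

    Computes -log(sigmoid(beta * margin)), where the margin measures how much
    more the policy favors the chosen response than the reference model does.
    """
    margin = (policy_chosen - ref_chosen) - (policy_rejected - ref_rejected)
    # -log(sigmoid(x)) == log(1 + exp(-x))
    return math.log1p(math.exp(-beta * margin))

# Policy identical to the reference: no preference has been learned yet.
baseline = dpo_loss(-5.0, -5.0, -5.0, -5.0)

# Policy now favors the chosen response over the rejected one.
improved = dpo_loss(-4.0, -6.0, -5.0, -5.0)
```

The loss falls as the policy raises the chosen response's likelihood relative to the rejected one, while the reference terms keep it anchored to the starting model.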

#5 about 2 minutes

Using frameworks like NeMo RL to simplify model alignment

Open-source frameworks such as NeMo RL abstract away much of the complexity of implementing reinforcement-learning-based alignment algorithms.

How LLMs generate text and learn behavior

02:07 MIN

You are not my model anymore - understanding LLM model behavior

Understanding LLMs, context windows, and RAG

00:53 MIN

Beyond Prompting: Building Scalable AI with Multi-Agent Systems and MCP

Understanding the fundamentals of large language models

02:12 MIN

Building Blocks of RAG: From Understanding to Implementation

Why large language models need retrieval augmented generation

00:57 MIN

Build RAG from Scratch

The evolution of NLP from early models to modern LLMs

00:04 MIN

Harry Potter and the Elastic Semantic Search

Defining key GenAI concepts like GPT and LLMs

23:35 MIN

Enter the Brave New World of GenAI with Vector Search

Advanced techniques like RAG, function calling, and fine-tuning

35:15 MIN

Creating Industry ready solutions with LLM Models

Shifting from general LLMs to specialized models

30:39 MIN

ChatGPT vs Google: SEO in the Age of AI Search - Eric Enge

Inside the Mind of an LLM

Emanuele Fabbiani

about 2 months ago • World Congress 2025

Unlocking the Power of AI: Accessible Language Model Tuning for All

Cedric Clyburn & Legare Kerrison

about a year ago • World Congress 2024

LLMOps-driven fine-tuning, evaluation, and inference with NVIDIA NIM & NeMo Microservices

Anshul Jindal

about 2 months ago • World Congress 2025

Self-Hosted LLMs: From Zero to Inference

Roberto Carratalá & Cedric Clyburn

about 2 months ago • World Congress 2025

Exploring LLMs across clouds

Tomislav Tipurić

about 2 months ago • World Congress 2025

Give Your LLMs a Left Brain

Stephen Chin

about a year ago • World Congress 2024

Large Language Models ❤️ Knowledge Graphs

Michael Hunger

about a year ago • World Congress 2024

Three years of putting LLMs into Software - Lessons learned

Simon A.T. Jiménez

about 2 months ago • World Congress 2025
