Sergio Perez & Harshita Seth

Aug 20, 2025 • World Congress 2025

Adding knowledge to open-source LLMs

When is retrieval-augmented generation not enough? Learn the multi-stage process for deeply embedding new knowledge into an open-source LLM.

#1 about 4 minutes

Understanding the LLM training pipeline and knowledge gaps

LLMs are trained through pre-training and alignment, but require new knowledge to stay current, adapt to specific domains, and acquire new skills.

#2 about 5 minutes

Adding domain knowledge with continued pre-training

Continued pre-training adapts a foundation model to a specific domain by training it further on specialized, unlabeled data using self-supervised learning.
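The self-supervised objective mentioned here needs no labels because the text supplies its own targets: each token is predicted from the tokens before it. A minimal sketch of how training pairs fall out of raw domain text, assuming a toy whitespace tokenizer and a made-up sample sentence (both are illustrative, not from the talk):

```python
def next_token_pairs(token_ids, context_len):
    """Build (context, target) pairs: the target is the token that follows the context."""
    pairs = []
    for i in range(len(token_ids) - context_len):
        context = token_ids[i : i + context_len]
        target = token_ids[i + context_len]
        pairs.append((context, target))
    return pairs


# Hypothetical unlabeled domain text; a real pipeline would use a subword tokenizer.
text = "the valve controls coolant flow in the reactor loop"
words = text.split()
vocab = {word: idx for idx, word in enumerate(sorted(set(words)))}
ids = [vocab[word] for word in words]

pairs = next_token_pairs(ids, context_len=3)
for context, target in pairs:
    print(context, "->", target)
```

In an actual training run the model would be optimized to assign high probability to each target given its context; the sketch only shows where the supervision signal comes from.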

#3 about 6 minutes

Developing skills and reasoning with supervised fine-tuning

Supervised fine-tuning uses instruction-based datasets to teach models specific tasks, chat capabilities, and complex reasoning through techniques like chain of thought.
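To make the idea of an instruction-based dataset concrete, the sketch below turns one instruction/response pair into a single training sequence plus a loss mask, so that only the response tokens contribute to the loss. The prompt template, the `build_sft_example` helper, and the whitespace tokenizer are assumptions for illustration; masking the prompt is a common choice, not the only one.

```python
def build_sft_example(instruction, response, tokenize):
    """Concatenate prompt and response tokens and mark which positions are trained on."""
    # Hypothetical prompt template; real chat models each define their own format.
    prompt = f"### Instruction:\n{instruction}\n### Response:\n"
    prompt_tokens = tokenize(prompt)
    response_tokens = tokenize(response)
    input_tokens = prompt_tokens + response_tokens
    # 0 = ignore in the loss (prompt), 1 = learn to produce (response).
    loss_mask = [0] * len(prompt_tokens) + [1] * len(response_tokens)
    return input_tokens, loss_mask


def whitespace_tokenize(s):
    """Toy tokenizer used only to keep the example dependency-free."""
    return s.split()


tokens, mask = build_sft_example(
    "Summarize the report",
    "The report covers Q3 results.",
    whitespace_tokenize,
)
for token, flag in zip(tokens, mask):
    print(flag, token)
```

A chain-of-thought dataset would use the same structure, with the intermediate reasoning steps written out as part of the response text.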

#4 about 8 minutes

Aligning models with human preferences using reinforcement learning

Preference alignment refines model behavior using reinforcement learning, evolving from complex RLHF (reinforcement learning from human feedback) with separate reward models to simpler methods like DPO (direct preference optimization).
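One reason DPO is considered simpler is that it replaces the separate reward model with a single loss over preference pairs. The sketch below computes the published DPO loss for one pair, assuming the summed log-probabilities of the chosen and rejected responses under the policy and a frozen reference model are already available. The function name, argument names, and numbers are illustrative and not taken from the talk.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one preference pair: -log(sigmoid(beta * margin difference))."""
    # How much more (in log-probability) the policy likes each response than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(x)) == log(1 + exp(-x)), written to avoid overflow for large |x|.
    return max(-logits, 0.0) + math.log1p(math.exp(-abs(logits)))


# Policy identical to the reference: no preference learned yet, loss is log(2).
print(dpo_loss(-5.0, -5.0, -5.0, -5.0))

# Policy has shifted probability toward the chosen response: loss drops below log(2).
print(dpo_loss(-4.0, -6.0, -5.0, -5.0))
```

The `beta` value controls how strongly the policy is allowed to move away from the reference model; 0.1 here is only a placeholder.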

#5 about 2 minutes

Using frameworks like NeMo RL to simplify model alignment

Frameworks like the open-source NeMo RL abstract away the complexity of implementing advanced, reinforcement-learning-based alignment algorithms.


Understanding the core capabilities of large language models

02:26 MIN

Data Privacy in LLMs: Challenges and Best Practices

Introducing InstructLab for accessible LLM fine-tuning

01:12 MIN

Unlocking the Power of AI: Accessible Language Model Tuning for All

How large language models are trained

08:05 MIN

Inside the Mind of an LLM

Addressing the core challenges of large language models

05:18 MIN

Accelerating GenAI Development: Harnessing Astra DB Vector Store and Langflow for LLM-Powered Apps

Addressing the key challenges of large language models

02:55 MIN

Large Language Models ❤️ Knowledge Graphs

Using large language models as a learning tool

03:42 MIN

Google Gemini: Open Source and Deep Thinking Models - Sam Witteveen

The training process of large language models

02:21 MIN

Google Gemini: Open Source and Deep Thinking Models - Sam Witteveen

Overcoming the problem of stale knowledge in LLMs

03:20 MIN

Engineering Mindset in the Age of AI - Gunnar Grosch, AWS

Inside the Mind of an LLM

Emanuele Fabbiani

about 8 months ago • World Congress 2025

LLMOps-driven fine-tuning, evaluation, and inference with NVIDIA NIM & NeMo Microservices

Anshul Jindal

about 8 months ago • World Congress 2025

Unlocking the Power of AI: Accessible Language Model Tuning for All

Cedric Clyburn & Legare Kerrison

about 2 years ago • World Congress 2024

Self-Hosted LLMs: From Zero to Inference

Roberto Carratalá & Cedric Clyburn

about 8 months ago • World Congress 2025

Exploring LLMs across clouds

Tomislav Tipurić

about 8 months ago • World Congress 2025

Large Language Models ❤️ Knowledge Graphs

Michael Hunger

about 2 years ago • World Congress 2024

Creating Industry ready solutions with LLM Models

Vijay Krishan Gupta & Gauravdeep Singh Lotey

about 2 years ago • WeAreDevelopers LIVE

The State of GenAI & Machine Learning in 2025

Alejandro Saucedo

about 8 months ago • World Congress 2025

Related Articles

Luis Minvielle

What Are Large Language Models?

Developers and writers can finally agree on one thing: Large Language Models, the subset of AIs that drive ChatGPT and its competitors, are stunning tech creations. Developers enjoying the likes of GitHub Copilot know the feeling: this new kind of te...

Benedikt Bischof

MLops – Deploying, Maintaining And Evolving Machine Learning Models in Production

Welcome to this issue of the WeAreDevelopers Live Talk series. This article recaps an interesting talk by Bas Geerdink who gave advice on MLOps. About the speaker: Bas is a programmer, scientist, and IT manager. At ING, he is responsible for the Fast...

Krissy Davis

The Best Large Language Models on The Market

Large language models are sophisticated programs that enable machines to comprehend and generate human-like text. They have been the foundation of natural language processing for almost a decade. Although generative AI has only recently gained popula...

Daniel Cranney

Slopquatting, API Keys, Fun with Fonts, Recruiters vs AI and more - The Best of LIVE 2025 - Part 2

This week, we’re continuing our look-back on some of the best moments from the Weekly Developer Show from 2025. Here’s what some of our fantastic guests had to say… Sebastian Gingter cracked open the idea of “slopsquatting” and explained why we shou...
