The R in RAG: Why retrieval is often the weakest link (and how to fix it)

About This Session

Retrieval-augmented generation (RAG), which combines document retrieval with text generation, is one of the most popular ways to build LLM-powered applications. In demos and carefully prepared tests, these systems work great. In production, they often disappoint. Why? Because your documents are full of domain-specific terminology, internal jargon, and acronyms that off-the-shelf embedding models often fail to capture. If retrieval returns the wrong documents, even the best LLM can't save you.

In this session, I'll show how to fine-tune an embedding model for your specific domain. We'll walk through preparing training data, running the training process, and evaluating results. You don't need thousands of examples or expensive infrastructure: in the case I'll present, 50+ training samples were enough to dramatically improve retrieval quality. You'll leave with a practical understanding of when and how to fine-tune embedding models, and what pitfalls to watch out for along the way.