Daniel Savenkov
Beyond Autocomplete: Local AI Code Completion Demystified
#1 about 6 minutes
The case for local AI code completion
While cloud-based AI offers powerful models, a local approach provides better security, lower latency, and no subscription cost by using smaller, specialized models.
#2 about 4 minutes
Measuring user experience with online A/B testing
Online evaluation uses A/B tests to validate feature improvements, tracking positive signals such as the amount of code generated through completions and negative signals such as user annoyance.
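One common way to decide whether an A/B difference is real is a two-proportion z-test on a rate such as suggestion acceptance. The sketch below is only an illustration under that assumption: the talk does not say which statistical test or which exact metric is used, and the function name and counts are hypothetical.

```python
import math


def two_proportion_z_test(accepted_a, shown_a, accepted_b, shown_b):
    """Return (z, two-sided p-value) for the difference in acceptance rates.

    Group A is the control, group B is the variant under test.
    """
    rate_a = accepted_a / shown_a
    rate_b = accepted_b / shown_b
    # Pooled rate under the null hypothesis that both groups behave the same.
    pooled = (accepted_a + accepted_b) / (shown_a + shown_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / shown_a + 1 / shown_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical counts: accepted suggestions out of suggestions shown.
    z, p = two_proportion_z_test(250, 1000, 320, 1000)
    print(f"z = {z:.2f}, p = {p:.4f}")
```

The same function can be applied to a negative signal, for example the share of suggestions that users explicitly dismiss, with a lower rate in the variant counting as the improvement.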
#3 about 2 minutes
Guaranteeing code correctness with semantic checks
Suggestions are validated for semantic correctness by the IDE before being shown to the user, eliminating errors such as references to non-existent variables.
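The idea of rejecting suggestions that reference unknown names can be sketched with Python's standard `ast` module standing in for the IDE's own code analysis. This is an assumption for illustration only: the real check runs inside the IDE and is far more thorough, and `undefined_names` and `filter_suggestions` are hypothetical helper names.

```python
import ast
import builtins


def undefined_names(suggestion: str, names_in_scope: set[str]) -> set[str]:
    """Return names the suggestion reads that are not defined anywhere we know of."""
    tree = ast.parse(suggestion)
    names = [node for node in ast.walk(tree) if isinstance(node, ast.Name)]
    # Simplification: a name assigned anywhere in the suggestion counts as defined,
    # regardless of statement order.
    assigned = {n.id for n in names if isinstance(n.ctx, ast.Store)}
    known = names_in_scope | assigned | set(dir(builtins))
    return {n.id for n in names if isinstance(n.ctx, ast.Load) and n.id not in known}


def filter_suggestions(suggestions: list[str], names_in_scope: set[str]) -> list[str]:
    """Keep only suggestions that parse and reference no unknown names."""
    kept = []
    for suggestion in suggestions:
        try:
            if not undefined_names(suggestion, names_in_scope):
                kept.append(suggestion)
        except SyntaxError:
            continue  # Unparsable suggestions are never shown.
    return kept


if __name__ == "__main__":
    candidates = [
        "total = price * quantity",
        "total = price * quantty",  # typo: unknown variable
        "print(price",              # syntax error
    ]
    print(filter_suggestions(candidates, {"price", "quantity"}))
```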
#4 about 3 minutes
Using a filter model to reduce user annoyance
A secondary machine learning model predicts the probability of a suggestion being accepted, filtering out suggestions that are correct but unhelpful.
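A filter of this kind can be pictured as a small classifier that scores each suggestion and hides those below a threshold. The sketch below uses a hand-written logistic model; the feature names, weights, bias, and threshold are all invented for illustration, since the talk summary does not describe the actual model or its inputs.

```python
import math

# Hypothetical weights; a real filter model would learn these from logged
# accept/reject events rather than have them written by hand.
WEIGHTS = {
    "suggestion_length": -0.02,   # longer suggestions are accepted less often
    "model_confidence": 3.0,      # mean token probability from the language model
    "recent_accept_rate": 2.0,    # how often this user accepted suggestions lately
}
BIAS = -2.5


def acceptance_probability(features: dict[str, float]) -> float:
    """Logistic-regression estimate of the chance the user accepts a suggestion."""
    score = BIAS + sum(WEIGHTS[name] * features[name] for name in WEIGHTS)
    return 1 / (1 + math.exp(-score))


def should_show(features: dict[str, float], threshold: float = 0.3) -> bool:
    """Show the suggestion only if it is likely enough to be accepted."""
    return acceptance_probability(features) >= threshold


if __name__ == "__main__":
    likely = {"suggestion_length": 20, "model_confidence": 0.9, "recent_accept_rate": 0.5}
    unlikely = {"suggestion_length": 80, "model_confidence": 0.3, "recent_accept_rate": 0.1}
    print(should_show(likely), should_show(unlikely))
```

Raising the threshold trades fewer shown suggestions for less annoyance, which is the kind of trade-off an A/B test can then measure.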
#5 about 2 minutes
Implementing efficient local model inference
Using a native C++ inference engine such as llama.cpp enables fast, low-level execution of the language model directly on the user's machine.
#6 about 2 minutes
Training small, specialized language models from scratch
Training small, language-specific models in-house is cost-effective and allows for extensive experimentation to optimize performance for local execution.
#7 about 2 minutes
Accelerating development with offline evaluation
An offline evaluation pipeline runs the IDE in a headless mode to test hypotheses quickly, pre-selecting the most promising changes for slower A/B tests.
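The pre-selection step can be illustrated with a toy harness that replays held-out code and ranks completion variants by how often they reproduce what the developer actually wrote. This is a simplified sketch, not the pipeline from the talk: there, the IDE itself runs headlessly, whereas here each variant is just a hypothetical Python function, and exact match is an assumed metric.

```python
from typing import Callable

Completer = Callable[[str], str]


def exact_match_rate(complete: Completer, cases: list[tuple[str, str]]) -> float:
    """Share of cases where the completion equals the code that was really written."""
    hits = sum(
        1 for prefix, expected in cases if complete(prefix).strip() == expected.strip()
    )
    return hits / len(cases)


def rank_variants(
    variants: dict[str, Completer], cases: list[tuple[str, str]]
) -> list[tuple[str, float]]:
    """Score every variant offline and return them best first."""
    scores = {name: exact_match_rate(fn, cases) for name, fn in variants.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical held-out cases: (code before the cursor, code the developer typed).
    cases = [("for i in ", "range(n):"), ("return ", "result")]
    variants = {
        "baseline": lambda prefix: "result",
        "candidate": lambda prefix: "range(n):" if prefix.startswith("for") else "result",
    }
    print(rank_variants(variants, cases))
```

Only the top-ranked variants would then go on to the slower online A/B test.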
#8 about 1 minute
Structuring the team for a local AI feature
A small, cross-functional team of 10-20 people with diverse skills is an effective structure for delivering complex AI features.
#9 about 6 minutes
Key takeaways for building local AI features
Local AI is a rapidly growing field where success depends on more than just the language model, requiring a focus on security and user experience.