Nimrod Kor
The Limits of Prompting: Architecting Trustworthy Coding Agents
#1 (about 2 minutes)
Prototyping a basic AI code review agent
A simple prototype using a GitHub webhook and a single LLM call reveals the potential for understanding code semantics beyond static analysis.
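The first prototype can be as small as a webhook handler that turns a pull request event into one LLM prompt. A minimal sketch of that shape, assuming the payload fields of GitHub's `pull_request` webhook event; the prompt wording and function name are illustrative, not the speaker's actual implementation:

```python
def build_review_prompt(webhook_payload: dict, diff_text: str) -> str:
    """Turn a GitHub pull_request webhook event into a single review prompt.

    Illustrative sketch: the real agent would send this string to an LLM API
    in one call and post the response as a PR comment.
    """
    pr = webhook_payload["pull_request"]
    return (
        "You are a code reviewer. Review the following pull request.\n"
        f"Title: {pr['title']}\n"
        f"Description: {pr.get('body') or '(none)'}\n\n"
        f"Diff:\n{diff_text}\n\n"
        "Point out semantic issues that static analysis would miss."
    )

# Example payload, truncated to the fields this sketch uses.
payload = {"pull_request": {"title": "Fix race in cache", "body": None}}
prompt = build_review_prompt(payload, "- old line\n+ new line")
```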
#2 (about 2 minutes)
Iteratively improving prompts to handle edge cases
Simple prompts fail to account for developer comments or the model's knowledge cutoff, requiring more detailed instructions to improve accuracy.
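The two edge cases named here can be handled by folding extra instructions and context into the prompt itself. A hedged sketch of such a second iteration; the rule wording is an assumption, not the talk's literal prompt:

```python
def build_prompt_v2(diff_text: str, pr_comments: list[str]) -> str:
    """Second-iteration prompt: feed in developer comments and warn the
    model about its knowledge cutoff (both rules are illustrative)."""
    comment_block = "\n".join(f"- {c}" for c in pr_comments) or "(no comments)"
    return (
        "Review this diff. Rules:\n"
        "1. If a developer comment already explains a change, do not flag it.\n"
        "2. Your knowledge cutoff may predate this code; do not flag "
        "unfamiliar APIs as nonexistent.\n\n"
        f"Developer comments:\n{comment_block}\n\n"
        f"Diff:\n{diff_text}"
    )

prompt_v2 = build_prompt_v2("+ x = 1", ["renamed for clarity"])
```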
#3 (about 5 minutes)
Establishing a robust benchmarking process for agents
A reliable benchmarking pipeline uses a large dataset, concurrent execution, and an LLM-as-a-judge (LLJ) to measure and track performance improvements.
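The pipeline described here has three moving parts: a dataset of cases, concurrent agent runs, and a judge that scores each output. A minimal sketch of that loop using a thread pool; the stub agent and exact-match judge stand in for the real LLM-backed agent and LLM-as-a-judge call:

```python
from concurrent.futures import ThreadPoolExecutor

def benchmark(dataset, agent, judge, workers=8):
    """Run `agent` over all cases concurrently, score each output with
    `judge` (0.0-1.0), and return the mean score for the run."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        outputs = list(pool.map(agent, [case["input"] for case in dataset]))
    scores = [judge(case["expected"], out)
              for case, out in zip(dataset, outputs)]
    return sum(scores) / len(scores)

# Deterministic stand-ins so the sketch runs offline:
stub_agent = str.upper                      # real version: an LLM review call
stub_judge = lambda exp, got: float(exp == got)  # real version: an LLJ prompt

dataset = [{"input": "ok", "expected": "OK"},
           {"input": "no", "expected": "NO"}]
score = benchmark(dataset, stub_agent, stub_judge)
```

Tracking this single score across prompt and architecture changes is what turns "it feels better" into a measurable improvement.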
#4 (about 2 minutes)
Decomposing large tasks into specialized agents
To combat inconsistency and hallucinations, a single large task like code review is broken down into multiple smaller, specialized agents.
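Decomposition means each agent gets one narrow concern and an orchestrator merges their findings. A toy sketch of that fan-out, assuming three hypothetical specializations (the agent names and heuristics are illustrative; real ones would each be an LLM call with a focused prompt):

```python
def security_agent(diff):
    return ["possible hardcoded credential"] if "password=" in diff else []

def style_agent(diff):
    return [f"line {i + 1} exceeds 100 chars"
            for i, line in enumerate(diff.splitlines()) if len(line) > 100]

def correctness_agent(diff):
    return ["bare except swallows errors"] if "except:" in diff else []

AGENTS = {"security": security_agent,
          "style": style_agent,
          "correctness": correctness_agent}

def review(diff):
    """Orchestrator: run every specialized agent and tag each finding
    with the agent that produced it."""
    findings = []
    for name, agent in AGENTS.items():
        findings.extend((name, f) for f in agent(diff))
    return findings

findings = review("try:\n    pass\nexcept:\n    password='x'\n")
```

Because each agent's scope is small, its prompt stays short and its failure modes are easier to benchmark in isolation.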
#5 (about 6 minutes)
Leveraging codebase context for deeper insights
Moving beyond prompts, providing codebase context via vector similarity (RAG) and module dependency graphs (AST) unlocks high-quality, human-like feedback.
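The AST half of this idea is straightforward to sketch: parse each file and extract the modules it depends on, which gives the dependency edges a reviewer agent can follow for context. A minimal Python-only version using the standard `ast` module (a real system would do this per language and pair it with vector retrieval over the codebase):

```python
import ast

def module_dependencies(source: str) -> set[str]:
    """Return the names of all modules a Python source file imports,
    covering both `import x` and `from x import y` forms."""
    tree = ast.parse(source)
    deps = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            deps |= {alias.name for alias in node.names}
        elif isinstance(node, ast.ImportFrom) and node.module:
            deps.add(node.module)
    return deps

deps = module_dependencies(
    "import os\nfrom json import loads\nimport sys as system\n"
)
```

Feeding the reviewed file's neighbors from this graph into the prompt is what lets the agent comment on cross-module effects a diff-only view would miss.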
#6 (about 3 minutes)
Introducing Awesome Reviewers for community standards
Awesome Reviewers is a collection of prompts derived from open-source projects that can be used to enforce team-specific coding standards.
#7 (about 1 minute)
Key takeaways for building reliable LLM agents
The path to a reliable agent involves starting with a proof-of-concept, benchmarking rigorously, using prompt engineering for quick fixes, and investing in deep context.