Nimrod Kor
The Limits of Prompting: Architecting Trustworthy Coding Agents
#1 (about 2 minutes)
Prototyping a basic AI code review agent
A simple prototype using a GitHub webhook and a single LLM call reveals the potential for understanding code semantics beyond static analysis.
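The first prototype can be as small as a webhook handler that turns a pull request event into one LLM prompt. A minimal sketch of that shape, assuming the payload fields of GitHub's `pull_request` webhook event; the prompt wording and function name are illustrative, not the speaker's actual implementation:

```python
def build_review_prompt(webhook_payload: dict, diff_text: str) -> str:
    """Turn a GitHub pull_request webhook event into a single review prompt.

    Illustrative sketch: the real agent would send this string to an LLM API
    in one call and post the response as a PR comment.
    """
    pr = webhook_payload["pull_request"]
    return (
        "You are a code reviewer. Review the following pull request.\n"
        f"Title: {pr['title']}\n"
        f"Description: {pr.get('body') or '(none)'}\n\n"
        f"Diff:\n{diff_text}\n\n"
        "Point out semantic issues that static analysis would miss."
    )

# Example payload, truncated to the fields this sketch uses.
payload = {"pull_request": {"title": "Fix race in cache", "body": None}}
prompt = build_review_prompt(payload, "- old line\n+ new line")
```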
#2 (about 2 minutes)
Iteratively improving prompts to handle edge cases
Simple prompts fail to account for developer comments or the model's knowledge cutoff, requiring more detailed instructions to improve accuracy.
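The two edge cases named here can be handled by folding extra instructions and context into the prompt itself. A hedged sketch of such a second iteration; the rule wording is an assumption, not the talk's literal prompt:

```python
def build_prompt_v2(diff_text: str, pr_comments: list[str]) -> str:
    """Second-iteration prompt: feed in developer comments and warn the
    model about its knowledge cutoff (both rules are illustrative)."""
    comment_block = "\n".join(f"- {c}" for c in pr_comments) or "(no comments)"
    return (
        "Review this diff. Rules:\n"
        "1. If a developer comment already explains a change, do not flag it.\n"
        "2. Your knowledge cutoff may predate this code; do not flag "
        "unfamiliar APIs as nonexistent.\n\n"
        f"Developer comments:\n{comment_block}\n\n"
        f"Diff:\n{diff_text}"
    )

prompt_v2 = build_prompt_v2("+ x = 1", ["renamed for clarity"])
```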
#3 (about 5 minutes)
Establishing a robust benchmarking process for agents
A reliable benchmarking pipeline uses a large dataset, concurrent execution, and an LLM-as-a-judge (LLJ) to measure and track performance improvements.
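The pipeline described here has three moving parts: a dataset of cases, concurrent agent runs, and a judge that scores each output. A minimal sketch of that loop using a thread pool; the stub agent and exact-match judge stand in for the real LLM-backed agent and LLM-as-a-judge call:

```python
from concurrent.futures import ThreadPoolExecutor

def benchmark(dataset, agent, judge, workers=8):
    """Run `agent` over all cases concurrently, score each output with
    `judge` (0.0-1.0), and return the mean score for the run."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        outputs = list(pool.map(agent, [case["input"] for case in dataset]))
    scores = [judge(case["expected"], out)
              for case, out in zip(dataset, outputs)]
    return sum(scores) / len(scores)

# Deterministic stand-ins so the sketch runs offline:
stub_agent = str.upper                      # real version: an LLM review call
stub_judge = lambda exp, got: float(exp == got)  # real version: an LLJ prompt

dataset = [{"input": "ok", "expected": "OK"},
           {"input": "no", "expected": "NO"}]
score = benchmark(dataset, stub_agent, stub_judge)
```

Tracking this single score across prompt and architecture changes is what turns "it feels better" into a measurable improvement.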
#4 (about 2 minutes)
Decomposing large tasks into specialized agents
To combat inconsistency and hallucinations, a single large task like code review is broken down into multiple smaller, specialized agents.
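Decomposition means each agent gets one narrow concern and an orchestrator merges their findings. A toy sketch of that fan-out, assuming three hypothetical specializations (the agent names and heuristics are illustrative; real ones would each be an LLM call with a focused prompt):

```python
def security_agent(diff):
    return ["possible hardcoded credential"] if "password=" in diff else []

def style_agent(diff):
    return [f"line {i + 1} exceeds 100 chars"
            for i, line in enumerate(diff.splitlines()) if len(line) > 100]

def correctness_agent(diff):
    return ["bare except swallows errors"] if "except:" in diff else []

AGENTS = {"security": security_agent,
          "style": style_agent,
          "correctness": correctness_agent}

def review(diff):
    """Orchestrator: run every specialized agent and tag each finding
    with the agent that produced it."""
    findings = []
    for name, agent in AGENTS.items():
        findings.extend((name, f) for f in agent(diff))
    return findings

findings = review("try:\n    pass\nexcept:\n    password='x'\n")
```

Because each agent's scope is small, its prompt stays short and its failure modes are easier to benchmark in isolation.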
#5 (about 6 minutes)
Leveraging codebase context for deeper insights
Moving beyond prompts, providing codebase context via vector similarity (RAG) and module dependency graphs (AST) unlocks high-quality, human-like feedback.
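The AST half of this idea is straightforward to sketch: parse each file and extract the modules it depends on, which gives the dependency edges a reviewer agent can follow for context. A minimal Python-only version using the standard `ast` module (a real system would do this per language and pair it with vector retrieval over the codebase):

```python
import ast

def module_dependencies(source: str) -> set[str]:
    """Return the names of all modules a Python source file imports,
    covering both `import x` and `from x import y` forms."""
    tree = ast.parse(source)
    deps = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            deps |= {alias.name for alias in node.names}
        elif isinstance(node, ast.ImportFrom) and node.module:
            deps.add(node.module)
    return deps

deps = module_dependencies(
    "import os\nfrom json import loads\nimport sys as system\n"
)
```

Feeding the reviewed file's neighbors from this graph into the prompt is what lets the agent comment on cross-module effects a diff-only view would miss.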
#6 (about 3 minutes)
Introducing Awesome Reviewers for community standards
Awesome Reviewers is a collection of prompts derived from open-source projects that can be used to enforce team-specific coding standards.
#7 (about 1 minute)
Key takeaways for building reliable LLM agents
The path to a reliable agent involves starting with a proof-of-concept, benchmarking rigorously, using prompt engineering for quick fixes, and investing in deep context.