AI Agents

5 things I wish I hadn’t done building my AI agent

with Shachar Azriel

Thursday 9 July 18:10 – 18:40 Stage 3 - powered by AWS

About This Session

Most talks about AI agents focus on success stories and best-case outcomes. This talk is about what can actually go wrong when you ship and scale an AI agent in a start-up. Over the past 18 months, our team at Baz built and scaled an AI-powered Code Review Agent used daily by thousands of users across the world. To move fast in this crazy market, we made several architectural, product, and UX decisions that seemed reasonable at the time but later turned into expensive mistakes. Some cost us users, and some hit our precious revenue.

In this session, I’ll share five concrete pitfalls we encountered while building a real AI coding agent: why they happened, how we detected them, and the pivots that ultimately worked. This is not a theoretical talk. Every example comes from a production system, and the session will include real system diagrams, usage data, and how the fixes changed behavior in production (alongside a lot of self-deprecating humor :)

1. We built a “smarter” agent, and it got “dumber”
   Why adding more context, tools, and responsibilities reduced accuracy instead of improving it

2. We let users choose the model, and lost control of the results
   How exposing LLM choice destroyed consistency and meaningful feedback

3. We optimized for an AI app, not for developer behavior
   Why real adoption only starts when the agent lives where decisions are already being made (GitHub, GitLab, or the IDE)

4. Our guardrails worked, until the providers changed the models
   How silent model updates broke engineering assumptions and eroded user trust

5. Our metrics looked great, but users were still churning
   Why industry-standard AI metrics (like accepted suggestions and time-to-merge) missed the signal that actually won (or lost) customers

Topics

  • Agentic AI
  • Best Practices
  • Startups