About This Session
AI is moving from the lab into production, but traditional monitoring can’t keep up. A system might look perfect on a dashboard while it hallucinates, makes poor decisions, or loops through expensive token calls. These are behavioral failures, not technical bugs, and they often stay hidden inside the black box. This talk looks at the practical hurdles of scaling AI. We will discuss why teams are currently debugging in the dark and how silent failures in agentic workflows break user trust. Observability is the key to shifting from tracking system health to evaluating decisions. We will look at how end-to-end visibility into LLM operations, including prompt tracking and tool usage, provides the data needed for reliability and cost control. This session offers a strategy for building the observability foundations required to scale enterprise AI safely and predictably.
Topics
- AI Standards
- Agentic AI
- Compliance
- DevOps
- Large Language Models (LLMs)
- LLMOps
- Observability
- Reliability
- Site Reliability Engineering (SRE)