About This Session
As engineers, we're familiar with classic site reliability patterns such as health checks, circuit breakers, retries, and failover. In this talk, we'll explore how GitHub Copilot's platform layer adapts these patterns to model inference. We'll cover AI-specific health signals (tokens per minute, time to first token, turn success rate), routing and failover across model endpoints, and additional availability levers like GPU capacity tuning and auto model selection.
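To make the idea of AI-specific health signals concrete, here is a minimal sketch of how the three signals named above could gate routing and failover across model endpoints. It is not Copilot's implementation: the `EndpointHealth` type, the threshold values, the endpoint names, and the priority-order failover policy are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class EndpointHealth:
    """Recent health signals for one model endpoint (hypothetical shape)."""
    name: str
    tokens_per_minute: float       # observed throughput
    tpm_limit: float               # assumed capacity ceiling for the endpoint
    time_to_first_token_s: float   # recent latency until the first streamed token
    turn_success_rate: float       # fraction of recent turns that completed


# Illustrative thresholds only; real values would come from SLOs and load tests.
MAX_TTFT_S = 2.0
MIN_SUCCESS_RATE = 0.95
MAX_UTILIZATION = 0.9


def is_healthy(e: EndpointHealth) -> bool:
    """An endpoint is healthy if it is fast, succeeding, and has headroom."""
    return (
        e.time_to_first_token_s <= MAX_TTFT_S
        and e.turn_success_rate >= MIN_SUCCESS_RATE
        and e.tokens_per_minute <= MAX_UTILIZATION * e.tpm_limit
    )


def pick_endpoint(endpoints: Sequence[EndpointHealth]) -> Optional[EndpointHealth]:
    """Return the first healthy endpoint in priority order, or None if all fail."""
    for e in endpoints:
        if is_healthy(e):
            return e
    return None


if __name__ == "__main__":
    fleet = [
        EndpointHealth("primary", 80_000, 100_000, 5.0, 0.99),    # slow first token
        EndpointHealth("secondary", 40_000, 100_000, 0.8, 0.98),  # healthy
    ]
    chosen = pick_endpoint(fleet)
    print(chosen.name if chosen else "no healthy endpoint")
```

In this sketch a slow time to first token on the first endpoint is enough to fail over to the second, even though the first endpoint would pass a conventional liveness check.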
Topics
- AI Models
- Best Practices
- Copilot
- Distributed Systems
- GitHub
- Large Language Models (LLMs)
- Load Balancing
- Multi-Cloud
- Reliability
- Scaling
- Site Reliability Engineering (SRE)