Martin Beránek
SRE Methods In an Agency Environment
#1 · about 1 minute
Defining the core concepts of SLI, SLO, and SLA
Understand the foundational SRE terms: Service Level Indicators (SLI) as measurements, Service Level Objectives (SLO) as targets, and Service Level Agreements (SLA) as contracts with penalties.
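The distinction between a measurement and a target can be shown with a minimal Python sketch. It is not taken from the talk: the names `Slo`, `availability_sli`, and `meets_slo`, and the 99.9% target, are illustrative assumptions. An SLA would sit on top of this as a contractual consequence of missing the target, which is not modelled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slo:
    """A target for an indicator (hypothetical structure, for illustration)."""
    name: str
    target: float  # e.g. 0.999 means "99.9% of requests succeed"


def availability_sli(good_requests: int, total_requests: int) -> float:
    """SLI: the measured fraction of requests that were served successfully."""
    if total_requests == 0:
        # No traffic means nothing failed; treating this as 1.0 is an assumption.
        return 1.0
    return good_requests / total_requests


def meets_slo(sli: float, slo: Slo) -> bool:
    """SLO check: is the measurement at or above the agreed target?"""
    return sli >= slo.target


if __name__ == "__main__":
    slo = Slo(name="checkout availability", target=0.999)
    sli = availability_sli(good_requests=99_950, total_requests=100_000)
    print(f"SLI={sli:.4f} meets SLO: {meets_slo(sli, slo)}")
```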
#2 · about 6 minutes
Mapping SRE responsibilities in an agency-customer model
Clarify the roles and responsibilities between the agency, the customer, and the end-user clients using a simple relationship model.
#3 · about 4 minutes
Collaboratively creating SLO documentation with customers
Navigate the process of defining realistic SLOs with customers, from initial guessing games and benchmarking to periodic evaluation after launch.
#4 · about 2 minutes
Navigating the two primary application handover scenarios
Prepare for two distinct handover situations: when the customer has the capacity to take over operations versus when the agency retains responsibility for reliability.
#5 · about 3 minutes
The three essential SRE documents for agencies
Implement a "holy trinity" of documentation—the SLO document, support playbooks, and postmortems—to ensure clarity and operational readiness.
#6 · about 4 minutes
How to write effective and blameless postmortems
Structure postmortems to be detailed and blameless by including key sections like impact, root cause, resolution, action items, and a minute-by-minute timeline.
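One possible way to keep postmortems consistent is to generate them from a fixed skeleton. The sketch below assumes only the sections named above (impact, root cause, resolution, action items, timeline). The `Postmortem` class, its field names, and the plain-text layout are hypothetical and not the speaker's template.

```python
from dataclasses import dataclass, field


@dataclass
class Postmortem:
    """Hypothetical postmortem skeleton; describe systems and events, not people."""
    title: str
    impact: str
    root_cause: str
    resolution: str
    action_items: list[str] = field(default_factory=list)
    # (time, event) pairs; zero-padded "HH:MM" strings sort chronologically
    # as long as the incident stays within a single day.
    timeline: list[tuple[str, str]] = field(default_factory=list)

    def render(self) -> str:
        """Return the postmortem as plain text with one section per heading."""
        lines = [f"Postmortem: {self.title}", ""]
        sections = (
            ("Impact", self.impact),
            ("Root cause", self.root_cause),
            ("Resolution", self.resolution),
        )
        for heading, body in sections:
            lines += [heading, body, ""]
        lines.append("Action items")
        lines += [f"- {item}" for item in self.action_items]
        lines += ["", "Timeline"]
        lines += [f"{when} {event}" for when, event in sorted(self.timeline)]
        return "\n".join(lines)


if __name__ == "__main__":
    pm = Postmortem(
        title="Checkout errors after deploy",
        impact="A share of checkout requests failed for several minutes.",
        root_cause="A configuration change removed a required database setting.",
        resolution="The change was rolled back.",
        action_items=["Validate configuration in CI before deploy"],
        timeline=[("10:05", "Rollback completed"), ("09:58", "Alert fired")],
    )
    print(pm.render())
```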
#7 · about 4 minutes
Defining key roles for effective incident management
Establish clear responsibilities during an incident by assigning an Incident Commander, Communications Lead, and Operations Lead to streamline resolution.
#8 · about 5 minutes
Managing unexpected costs from environment and security issues
Account for unexpected work from cloud provider changes and security vulnerabilities by using an error budget policy to assess impact and prioritize fixes.
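As a rough illustration of how an error budget can drive prioritization, the sketch below computes the unspent share of a budget from request counts and compares it with the share an issue is expected to consume. The function names, the 25% threshold, and the three outcome labels are assumptions made for this example. A real error budget policy is whatever the agency and customer agree on.

```python
def error_budget_remaining(target: float, good: int, total: int) -> float:
    """Fraction of the error budget still unspent (negative when overspent).

    The budget is the number of failures the SLO target tolerates:
    (1 - target) * total requests in the evaluation window.
    """
    allowed_bad = (1.0 - target) * total
    actual_bad = total - good
    if allowed_bad <= 0:
        # A 100% target (or no traffic) leaves no budget to spend.
        return 1.0 if actual_bad == 0 else float("-inf")
    return 1.0 - actual_bad / allowed_bad


def prioritize(remaining: float, expected_burn: float) -> str:
    """Map an issue's expected budget consumption to a priority.

    `expected_burn` is the estimated fraction of the whole budget the issue
    would consume if left unfixed. The 0.25 factor is a hypothetical cut-off.
    """
    if expected_burn >= remaining:
        return "fix now"
    if expected_burn >= 0.25 * remaining:
        return "schedule"
    return "backlog"


if __name__ == "__main__":
    remaining = error_budget_remaining(target=0.99, good=9_950, total=10_000)
    print(f"remaining budget: {remaining:.2f}")
    print("deprecated cloud API:", prioritize(remaining, expected_burn=0.2))
    print("critical vulnerability:", prioritize(remaining, expected_burn=0.6))
```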
#9 · about 4 minutes
Securely handing over credentials and application secrets
Execute a secure handover by properly managing user credentials in cloud environments like GCP and AWS and using secret managers for application secrets.
#10 · about 3 minutes
Finalizing the handover with documentation and tooling
Complete the project handover by sharing the essential SRE documents, explaining relevant tooling, and conducting an adoption period with the customer's team.
#11 · about 9 minutes
Key takeaways for applying SRE in an agency
Recognize that SRE is often underestimated, requires extensive explanation, and should ultimately focus on improving the user experience rather than just methodology.


