Site Reliability Coach
Job description
The SRE Expert/Coach will lead the design, development, and delivery of advanced training programs and technical bootcamps to elevate the skills of aspiring SREs. This role is pivotal in driving SRE adoption, embedding best practices, and fostering a culture of reliability and automation across the organization. The coach will work closely with engineering, operations, and product teams to ensure training aligns with business needs and industry standards.
Key Responsibilities
- Training Program Design & Delivery
  - Develop progressive, hands-on training materials and bootcamp curricula for SRE fundamentals, intermediate, and advanced levels.
  - Deliver technical bootcamps (in-person and virtual) tailored to diverse audiences, including engineers, tech leads, and managers.
  - Facilitate workshops, awareness sessions, and embedded coaching to support SRE transformation journeys.
  - Customize training content for multiple technology stacks (AWS, Azure, GCP, Private Cloud) and organizational personas.
  - Assess learning needs, conduct capability gap analysis, and design targeted learning pathways.
- Technical Leadership & Mentoring
  - Mentor and coach junior SREs and cross-functional teams on reliability engineering principles, automation, and incident management.
  - Guide teams in implementing Service Level Objectives (SLOs), Service Level Indicators (SLIs), and error budgets.
  - Promote best practices in monitoring, observability, and blameless postmortems.
- Content Development & Continuous Improvement
  - Curate and create e-learning modules, assessments, and certification pathways.
  - Evaluate and iterate training materials based on feedback and evolving industry standards.
  - Collaborate with internal and external stakeholders to ensure training effectiveness and relevance.
Requirements
- Proven experience as a Site Reliability Engineer, SRE Coach, or similar role in large-scale cloud environments.
- Deep expertise in cloud infrastructure (AWS, Azure, GCP), automation (Terraform, Ansible, CloudFormation), and CI/CD pipelines.
- Strong background in incident response, root cause analysis, and reliability engineering.
- Experience designing and delivering technical training, bootcamps, or workshops for engineering teams.
- Excellent communication, facilitation, and mentoring skills.
- Ability to tailor content for multicultural and geographically distributed teams (UK, India).
- Familiarity with industry frameworks and best practices (Google SRE, DevOps, ITIL).
Certifications (Preferred)
- SRE Foundation (DevOps Institute)
- Google Professional SRE Certification
- IBM Certified Professional SRE - Cloud v2
- AWS/Azure/GCP Cloud Certifications
- Any other relevant DevOps or Agile coaching certifications
Education
- Bachelor's or Master's degree in Computer Science, Engineering, or related field (or equivalent experience).