Senior Site Reliability Engineer
Job description
You will operate at the intersection of software engineering, cloud infrastructure and reliability engineering. This role goes beyond execution and delivery: you will be expected to design, plan and lead initiatives, shaping how reliability, observability and incident management are implemented across the organisation.
You will partner closely with engineering teams, influence architectural decisions early, and help define how reliability is measured and improved as the platform scales.
Requirements
- Led initiatives across multiple teams or domains rather than working solely within one squad
- Designed and evolved systems with clear reasoning around trade-offs, failure modes and long-term impact
- Strong communication skills and confidence presenting technical decisions in larger group settings
- Experience in scale-ups or mid-sized tech environments where structure is still evolving and ownership is high
Technical background
You bring strong depth across:
- Cloud infrastructure, ideally AWS, with solid networking and service-level understanding
- Containers and orchestration such as Kubernetes, ECS or similar
- Infrastructure as Code using tools like Terraform, Pulumi or CloudFormation
- Observability and monitoring, including metrics, logging and alerting, using tools such as Prometheus, Grafana, Datadog or CloudWatch
- CI/CD and automation practices with a focus on reliability and safety
You also have a strong software engineering background, with experience building and operating systems in languages such as Python, Node.js, Ruby or similar, not just scripting.
Reliability mindset
You are comfortable with:
- Defining and using SLOs and SLIs to make reliability measurable
- Using error budgets to guide engineering priorities
- Leading or participating in incident response and post-incident improvement
- Improving production readiness and on-call quality, and reducing recurring failure patterns
Why this role stands out
- High-impact senior role with real ownership and influence
- Opportunity to shape reliability practices in a growing engineering organisation
- Strong engineering culture with an emphasis on autonomy and trust
If you are a senior engineer who enjoys designing systems, leading initiatives and improving reliability at scale, this role offers the scope and autonomy to make a real impact.
Benefits & conditions
- Competitive salary, equity and a flexible hybrid working model