Senior Site Reliability Engineer | Azure Cloud
Job description
This is a high-impact opportunity to shape and promote Site Reliability Engineering (SRE) within a large international enterprise that is still building out its SRE capability. You will take ownership of reliability across both legacy and modern Azure-based platforms, lead problem solving on complex incidents, and help define how SRE should be embedded across engineering and leadership teams. Strong visibility, senior stakeholder exposure, and the chance to build long-term SRE maturity make this role especially attractive.
This is far more than a hands-on operational SRE position. The biggest challenge lies in stabilising a critical legacy application landscape, where most incidents currently occur, while helping scale reliability practices across a newer Azure cloud stack.
The scope combines execution, leadership, and advisory responsibilities.
On the execution side, you will lead complex incident investigations, run detailed post-mortems, improve observability, and drive structural fixes that reduce repeat issues. You will work across monitoring platforms, incident tooling, and Azure services to improve resilience, alerting quality, automation, and service performance.
On the strategic side, this role requires someone who truly understands what mature SRE looks like in a cloud-first organisation. You will help senior managers understand where SRE adds value, advise on best practices, and help shape the future operating model for reliability engineering. This makes previous experience building, leading, or maturing an SRE function especially valuable.
This is an excellent fit for someone who wants to move beyond pure execution and have a real say in how SRE evolves inside a complex international business. You will have the opportunity to influence leadership, improve critical production systems, and help establish best-in-class reliability practices from the ground up.
Requirements
Are you a senior SRE who enjoys combining deep technical incident expertise with strategic influence?
- Senior-level SRE experience in cloud environments, ideally 5+ years
- Strong Azure platform knowledge across cloud operations, monitoring, and automation
- Proven experience leading incident post-mortems and root cause investigations
- Experience improving reliability in complex legacy application environments
- The ability to influence managers and senior stakeholders on SRE strategy and value
- Excellent communication skills in English, both technical and executive-facing
About you
- 5+ years in a senior SRE, reliability engineering, or cloud operations role
- Strong Azure cloud and hybrid infrastructure expertise
- Deep experience with incident management and post-mortem facilitation
- Background in legacy stack modernisation and production stability improvements
- Comfortable working on both strategic and hands-on topics
- Experience leading or scaling an SRE practice is a major plus
- Strong communication and stakeholder management skills
- Experience in enterprise or highly regulated environments is beneficial