Senior SRE

Anson McCade
Glasgow, United Kingdom
12 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior
Compensation
£85K

Job location

Glasgow, United Kingdom

Tech stack

Artificial Intelligence
Bash
Cloud Computing
Computer Programming
Continuous Integration
Linux
DevOps
Distributed Systems
Python
Reliability Engineering
Ansible
Data Logging
Scripting (Bash/Python/Go/Ruby)
CloudFormation
Terraform
Go

Job description

Resilience
Lead and participate in incident response, root cause analysis and blameless post-mortems.
Use data and observability to reduce mean time to detect and resolve.
Drive improvements through SLOs, error budgets and reliability metrics.

Automation & Engineering Excellence
Develop automation and tooling using scripting and programming to remove toil.
Build CI/CD, infrastructure-as-code and self-service capabilities.
Champion continuous improvement through experimentation and measurement.

Security & Risk
Ensure secure configuration and operation of platforms.
Embed security controls and resilience patterns into infrastructure by design.

Collaboration & Leadership
Work closely with product, architecture and engineering teams to define platform requirements and solutions.
Influence technical direction, mentor engineers, and help mature SRE practices across the organisation.

Requirements

What we're looking for
Strong experience in Site Reliability Engineering, DevOps, or Platform Engineering.
Solid programming and scripting skills (e.g. Python, Go, Bash).
Deep understanding of Linux, networking, distributed systems and cloud platforms.
Experience with infrastructure-as-code and automation (e.g. Terraform, Ansible, CloudFormation).
Strong incident response, troubleshooting and fault-analysis skills using a scientific, data-driven approach.
Experience with observability: metrics, logging, tracing, alerting and performance analysis.
Ability to explain complex systems clearly and influence across technical and non-technical stakeholders.

Nice to have
Experience driving SRE maturity (SLOs, error budgets, reliability reviews).
Exposure to large-scale, regulated or high-availability environments.
Interest in using AI/ML to improve operations, monitoring or incident response.
Passion for teaching, mentoring and building engineering culture.

Why join
Work on large-scale, business-critical platforms with real impact.
Influence reliability, resilience and engineering standards across a complex organisation.
Competitive salary (£65k-£86k) plus bonus and strong benefits.
Hybrid working with base locations in Glasgow or Greater Manchester.

If you're an experienced SRE or platform engineer who enjoys solving complex reliability problems, building automation, and shaping how modern infrastructure is operated at scale, we'd love to hear from you.
