Site Reliability Engineer
Job description
Lead Site Reliability Engineer - Public Cloud Platform
Location: Halifax, Leeds or Manchester
We're looking for a Lead Site Reliability Engineer (SRE) to join our Public Cloud Platform, supporting both GCP and Azure. In this role you'll help strengthen observability, reliability and operational excellence across our cloud estate, enabling our ambition to become the UK's leading FinTech.
You'll work closely with Product Owners and Engineering Leads to embed SRE principles, lead a team of up to 15 SREs, and champion a culture of learning, automation and continuous improvement.
What You'll Be Doing
Lead, coach and develop a high-performing SRE team, fostering autonomy, inclusion and continuous improvement.
Partner with Product Owners and Engineering Leads to embed reliability into roadmaps, backlogs and delivery decisions.
Apply SRE principles (SLIs, SLOs, error budgets) to ensure our services remain highly reliable, performant and scalable.
Drive improvements in observability across metrics, logs, traces and events, ensuring services are observable by design.
Use Dynatrace as the primary observability platform for meaningful dashboards and customer-centric alerting.
Own Infrastructure-as-Code and CI/CD-based environments, implementing enhancements and responding to operational change.
Lead coordination of incident response and root cause analysis, supporting teams through major incidents, post-incident reviews and prevention of recurrence.
Collaborate with multi-disciplinary engineering teams to remove technical impediments, reduce toil and improve service operability.
Contribute hands-on engineering where needed, validating technical decisions and guiding best practice.
Bring an approach of curiosity, experimentation and first-principles thinking to evolve our engineering culture.
Requirements
Proven experience applying SRE practices within Azure, GCP, or both.
Strong understanding of SLIs, SLOs and error budgets, and how to use these to guide product and engineering decisions.
Experience ensuring reliability of production services, including availability, performance and recoverability.
Hands-on or leadership experience in incident and problem management, focused on reducing MTTR and avoiding repeat issues.
Background in software engineering or cloud engineering, with good understanding of modern SDLC practices.
Practical experience with DevOps, CI/CD and automation to improve service reliability.
Experience improving observability on complex, distributed systems.
Ability to use data to influence prioritisation and balance reliability with feature delivery.
Collaboration and communication skills, working effectively with product, engineering and platform teams.
Experience mentoring engineers and promoting inclusive, supportive team culture.
Desirable Skills
Certifications or strong experience with Google Cloud Platform and/or Microsoft Azure.
Knowledge of Kubernetes, compute services, API management and large-scale distributed systems.
Experience with Terraform, Jenkins or equivalent configuration/pipeline tooling.
Ability to write and maintain scripts or code in languages such as Python, Bash, PowerShell or Groovy.
Solid grasp of cloud networking, security, and systems built around APIs.
Experience with Infrastructure as Code, modular design and scalable automation patterns.
About You
You're someone who:
Is passionate about building resilient, observable, customer-focused platforms.
Enjoys coaching others, sharing knowledge and shaping engineering culture.
Looks for opportunities to remove toil and introduce automation.
Thrives in collaborative, multi-functional environments.
Adopts new tools, technologies and modern engineering approaches.
Values diverse perspectives, psychological safety and inclusive ways of working.
Benefits & conditions
Salary: £90,440 - £106,400
Working Pattern: Hybrid (2 days in office per week)
About the Opportunity
At Lloyds Banking Group, our purpose is to Help Britain Prosper. As we continue to reshape ourselves into a modern, innovative, purposeful organisation, we're investing heavily in cloud, automation and engineering excellence across our platforms.
What You'll Get in Return
We're committed to building a truly inclusive workplace where everyone can grow, thrive and make a meaningful impact. As part of LBG, you'll also receive:
A competitive salary and performance-related bonus
28 days holiday plus bank holidays
Generous pension contribution
Private medical insurance
Flexible benefits to suit your lifestyle
Hybrid working model and family-friendly policies
Access to wellbeing support, training and career development