Site Reliability Engineer

McGregor Boyall Associates Ltd.
Charing Cross, United Kingdom
3 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior
Compensation
£95K

Tech stack

Artificial Intelligence
Bash
Cloud Computing
Cloud Storage
Databases
Continuous Integration
Python
Reliability Engineering
Software Engineering
Data Streaming
Datadog
Scripting (Bash/Python/Go/Ruby)
Google Cloud Platform
Istio
Kubernetes
API Design
Dynatrace

Job description

We're seeking an experienced Site Reliability Engineer to join the Cloud Enabling team. You will play a crucial role in maturing our SRE capability and in strengthening the resiliency, availability, and security of our infrastructure and software.

Day to day:

  • Support customer-facing systems that handle billions of requests each month, ensuring availability, scalability, and resiliency.
  • Act as a key technical contributor in liaising with SRE guilds to drive improvements in cloud deployments, monitoring solutions, CI/CD pipelines, and cost optimisation.
  • Drive innovation by exploring new technologies and methodologies to enhance SRE capabilities, including AI tooling and automation opportunities.
  • Manage high-throughput systems in production to deliver customer value beyond proof-of-concepts.
  • Implement SLAs/SLOs/SLIs for software and data teams.
  • Develop tooling for efficient incident triage, granular alerting, well-defined runbooks, and auto-resolving mechanisms.
  • Serve as a subject matter expert in engineering conversations related to site reliability, fostering a culture of continuous learning and development.

Requirements

  • Proven hands-on experience in software development, testing, monitoring, and operational stability at scale.
  • Production experience with Kubernetes and monitoring tools such as Datadog or Dynatrace.
  • Strong knowledge of automation, CI/CD, and best practices.
  • Experience running postmortems, defining SLAs/SLIs/SLOs, and participating in support rotas.
  • Coding/scripting experience (Python/Bash) in a commercial setting.
  • Database knowledge, streaming and batch operations, and API design.
  • Strong background in Kubernetes (ideally with microservice architectures using the Istio service mesh).
  • Extensive experience with cloud-native solutions (ideally Google Cloud).
  • Solid understanding of cloud storage, networking, and resource provisioning.

Senior Site Reliability Engineer, Containerisation, Pipeline, GCP, Cloud

About the company

McGregor Boyall is a privately owned global recruitment consultancy founded in 1987. We are headquartered in the City of London, with additional offices covering the UK & Europe, the Middle East and North America. We provide permanent, contract and project-based recruitment services focusing on the mid-senior candidate market. We deliver within Financial Services, Commerce & Industry, and the Public Sector. Our primary specialisms cover all technology verticals and core business functions.
