Senior SRE GCP

Robert Walters
Charing Cross, United Kingdom
3 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior
Compensation
£ 90K

Job location

Charing Cross, United Kingdom

Tech stack

API
Artificial Intelligence
Bash
Cloud Computing
Databases
Python
Reliability Engineering
Software Engineering
Datadog
Google Cloud Platform
Istio
Reliability of Systems
Kubernetes
Build Tools
Dynatrace

Job description

As a Senior Site Reliability Engineer, you will be responsible for supporting high-throughput systems that serve millions of customers and billions of requests each month. You'll work on complex hybrid-cloud architectures, with a focus on Kubernetes-based workloads, networking, and monitoring solutions.

You'll also have the opportunity to drive improvements across cloud deployments, CI/CD pipelines, and cost optimisation while exploring new technologies and automation opportunities. Acting as a subject matter expert in site reliability engineering, you'll help foster a culture of continuous learning within the team.

  • Ensure critical systems are highly available, scalable, and resilient.
  • Develop and implement SLAs/SLOs/SLIs to enhance system reliability.
  • Build tools to improve incident management processes, including alerting mechanisms, runbooks, and auto-resolving solutions.
  • Drive innovation by exploring AI tooling and automation to improve SRE capabilities.
  • Collaborate with teams to optimise cloud deployments and monitoring solutions.
  • Actively participate in postmortems and support rotas to ensure operational excellence.

Requirements

We're seeking candidates with:

  • Proven experience in software development, testing, monitoring, and operational stability at scale.
  • Expertise in Kubernetes (ideally microservice architectures using Istio service mesh).
  • Strong knowledge of cloud-native solutions (preferably Google Cloud), including storage, networking, and resource provisioning.
  • Hands-on experience with monitoring tools such as Datadog or Dynatrace.
  • Proficiency in coding/scripting languages such as Python or Bash.
  • A solid understanding of automation best practices and CI/CD pipelines.
  • Experience designing APIs and working with database operations (streaming/batch).
