Site Reliability Engineer

Searchability
Ashton-under-Lyne, United Kingdom
2 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Compensation
Up to £70K

Job location

Ashton-under-Lyne, United Kingdom

Tech stack

ASP.NET
.NET
Amazon Web Services (AWS)
Cloud Computing
Software Debugging
DevOps
Distributed Systems
Performance Tuning
Reliability Engineering
Prometheus
Web Platforms
Performance Testing
Grafana
Kubernetes
Terraform
Programming Languages

Job description

  • As a Site Reliability Engineer, you will ensure the reliability, performance, and scalability of critical digital platforms. You will monitor production systems, refine SLAs/SLOs and error budgets, design scalable solutions, and use telemetry insights to improve the architecture. You will also build dashboards that give clear visibility of system health, contribute to performance testing strategies, and work closely with engineering, operations, and compliance teams to maintain high standards across the platform.

Technologies:

  • AWS
  • Cloud
  • Grafana
  • Kubernetes
  • Prometheus
  • Terraform
  • ASP.NET
  • DevOps

More:

We offer a salary of up to £70,000 and provide a hybrid working environment, with three days a week onsite in Greater Manchester. You will be joining a modern SRE environment equipped with cloud-native tooling such as AWS, Kubernetes, and Terraform, focused on high-availability digital platforms and performance-critical workloads.

As part of our team, you will play a high-impact role supporting major digital events while deepening your experience of reliability practices. We pride ourselves on a strong engineering culture that encourages collaboration across product, operations, and platform teams.

Requirements

  • Strong understanding of reliability engineering, scalable architectures, and performance optimization.
  • Experience with observability, debugging, and incident response.
  • Proficiency in a programming language for automation and tooling (Go or .NET preferred).
  • Cloud experience, ideally with AWS, and knowledge of container orchestration (Kubernetes) and Infrastructure as Code (Terraform).
  • Experience with monitoring and observability tools such as Grafana, Prometheus, or OpenTelemetry.
  • Strong understanding of networking fundamentals and distributed systems.
  • Ability to collaborate effectively with engineering, operations, and product teams.
