Site Reliability Engineer
Job description
- As a Site Reliability Engineer, you will ensure the reliability, performance, and scalability of critical digital platforms. You will monitor production systems, refine SLAs/SLOs and error budgets, design scalable solutions, and use telemetry insights to improve architecture. You will also build dashboards that give clear visibility into system health, contribute to performance testing strategies, and work closely with engineering, operations, and compliance teams to maintain high standards across the platform.
Technologies:
- AWS
- Cloud
- Grafana
- Kubernetes
- Prometheus
- Terraform
- ASP.NET
- DevOps
More:
We offer a salary of up to £70,000 and provide a hybrid working environment, with three days a week onsite in Greater Manchester. You will be joining a modern SRE environment equipped with cloud-native tooling such as AWS, Kubernetes, and Terraform, focused on high-availability digital platforms and performance-critical workloads.
As part of our team, you will gain hands-on exposure to reliability practices and play a high-impact role supporting major digital events. We foster a strong engineering culture and encourage collaboration across product, operations, and platform teams.
Requirements
- Strong understanding of reliability engineering, scalable architectures, and performance optimization.
- Experience with observability, debugging, and incident response.
- Proficiency in a programming language for automation and tooling (Go or .NET preferred).
- Cloud experience, ideally with AWS, and knowledge of container orchestration (Kubernetes) and Infrastructure as Code (Terraform).
- Experience with monitoring and observability tools such as Grafana, Prometheus, or OpenTelemetry.
- Strong understanding of networking fundamentals and distributed systems.
- Ability to collaborate effectively with engineering, operations, and product teams.