Site Reliability Engineer

Anson McCade
Manchester, United Kingdom
11 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Compensation
£60K

Job location

Manchester, United Kingdom

Tech stack

Amazon Web Services (AWS)
Systems Engineering
Azure
Cloud Computing
Databases
DevOps
Microsoft SQL Server
Performance Tuning
PowerShell
Reliability Engineering
Ansible
Software Engineering
Scripting (Bash/Python/Go/Ruby)
System Availability
Performance Monitor

Job description

Are you passionate about building resilient systems and eliminating operational toil through automation? We're looking for a Site Reliability Engineer (SRE) to join our high-impact team and help shape the future of our digital infrastructure.

As an SRE, you'll blend software engineering with systems engineering to ensure the reliability, availability, and performance of our platforms. You'll work on mission-critical systems, drive automation at scale, and collaborate across teams to embed reliability into every layer of our technology stack.

What You'll Do

  • Ensure the availability, scalability, and performance of systems through proactive monitoring and capacity planning.
  • Lead incident response and root cause analysis, and implement preventive measures to avoid recurrence.
  • Develop automation tools and scripts to reduce manual operations and improve system resilience.
  • Optimize system performance and resource usage, identifying and resolving bottlenecks.
  • Collaborate with development and product teams to integrate SRE best practices into the software lifecycle.
  • Contribute to the evolution of our SLIs, SLOs, and error budgets to drive reliability metrics.
  • Stay current with industry trends and contribute to our internal engineering communities.

Requirements

  • Proven experience as an SRE, DevOps Engineer, or Systems Engineer in a complex, high-availability environment.
  • Deep expertise in Microsoft SQL Server (2016–2022), including performance tuning, high availability, and architecture.
  • Strong scripting skills (e.g., PowerShell) and experience with automation/configuration tools like Ansible or Chef.
  • Familiarity with observability tools, monitoring frameworks, and incident management practices.
  • A mindset focused on eliminating toil, improving developer experience, and scaling operations through code.
  • Excellent communication and collaboration skills.

Bonus Points

  • Experience with cloud platforms (Azure, AWS, or GCP).
  • Background in database automation and estate standardization.
  • Knowledge of security and compliance in regulated environments.

Benefits & conditions

  • Be part of a forward-thinking engineering culture that values innovation, learning, and collaboration.
  • Access to cutting-edge tools and technologies.
  • Competitive compensation, benefits, and career growth opportunities.
