Principal SRE - Site Reliability

CBS Butler Limited
Wokingham, United Kingdom
2 days ago

Role details

Contract type
Contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Compensation
£135K

Tech stack

Java
Amazon Web Services (AWS)
Computing Platforms
Azure
Continuous Integration
DevOps
Distributed Systems
Python
Reliability Engineering
Software Engineering
Cloud Platform System
System Availability
Reliability of Systems
SC Clearance
Kubernetes
Infrastructure Automation Frameworks
Go

Job description

Role: Principal Site Reliability Engineer (Platform/DevOps)

Location: Wokingham (Reading) - Hybrid (60% remote/40% onsite)
Duration: 6 months+
Rate: £500-£520 per day
Clearance: Active SC Clearance required (mandatory)

Overview

We are seeking an experienced Principal SRE/Platform Engineer to lead platform-first initiatives focused on scalability, reliability, and performance across distributed systems. This role requires strong DevOps expertise and the ability to design and maintain resilient cloud-based infrastructure.

Key Responsibilities

  • Lead platform-first engineering initiatives to enhance scalability and reliability
  • Design, build, and maintain resilient infrastructure for distributed systems
  • Implement monitoring and alerting solutions to ensure high availability
  • Collaborate with engineering teams to improve system reliability and mitigate risks
  • Develop and maintain CI/CD pipelines to support efficient deployments
  • Recommend ongoing improvements to platform architecture and processes
  • Ensure compliance with security, governance, and regulatory standards

Required Skills & Experience

  • Strong background in software engineering for large-scale distributed systems
  • Proficiency in Golang, Java, or Python
  • Hands-on experience with AWS, Azure, or GCP
  • Deep knowledge of Kubernetes and container orchestration
  • Proven experience with CI/CD and infrastructure automation
  • Excellent troubleshooting and communication skills
