Site Reliability Engineer - AWS & Azure

Square One Resources Limited
Kilsby, United Kingdom
2 days ago

Role details

Contract type
Temporary contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Compensation
£ 157K

Job location

Kilsby, United Kingdom

Tech stack

Amazon Web Services (AWS)
Azure
Cloud Computing
Computer Engineering
DevOps
Monitoring of Systems
Information Technology Operations
Log Analysis
Reliability Engineering
Software Engineering
Data Processing
Containerization
Kubernetes
Information Technology
Terraform
Docker

Job description

We are seeking a highly skilled Site Reliability Engineer (SRE) with expertise in both Azure and AWS cloud platforms. The role takes a lead in migrating an existing on-prem HPC solution to the cloud and in enhancing the reliability, scalability, and performance of that cloud infrastructure through automation, software engineering practices, and proactive system management. The ideal candidate will bridge the gap between development and operations, applying a software engineering mindset to IT operations and infrastructure.

  • Work with existing solutions already in place in the US to redefine, implement, and maintain scalable, reliable cloud infrastructure across Azure and AWS for the UK business as a similar but separate entity.
  • Develop automation scripts and tools to streamline operational tasks such as log analysis, environment testing, and incident response.
  • Collaborate with development and operations teams to ensure seamless deployment and performance of applications and services.
  • Monitor system performance and availability, proactively identifying and resolving issues.
  • Apply software engineering principles to infrastructure management, improving efficiency and reducing manual effort.
  • Deliver value by monitoring spending, optimizing resource usage, right-sizing resources, and automating; implement governance through tagging strategies and budget alerts.
  • Document the solution and deliver knowledge transfer and training to existing team members.

Requirements

  • Strong understanding of cloud-native architectures and services in Azure and AWS, including AKS/EKS and their automation.
  • Experience with infrastructure-as-code tools (e.g., Terraform).
  • Familiarity with CI/CD pipelines, containerization (Docker, Kubernetes), and monitoring tools.
  • Knowledge of data processing and configuration design.
  • Experience with IT infrastructure and monitoring systems.
  • Bachelor's degree in Computer Science, Computer Engineering, Information Technology, or a related field.
  • Extensive experience in site reliability engineering, DevOps, or cloud infrastructure roles.

About the company

Square One is acting as both an employment agency and an employment business, and is an equal opportunities recruitment business. Square One embraces diversity and will treat everyone equally. Please see our website for our full diversity statement.
