Site Reliability Engineer

Experis
5 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Compensation
£100K

Job location

Tech stack

Java
Amazon Web Services (AWS)
Azure
Cloud Computing
Disaster Recovery
Distributed Systems
Python
Release Management
Reliability Engineering
Software Engineering
System Availability
Reliability of Systems
Kubernetes
Go
Programming Languages

Job description

Detect and mitigate system issues to ensure high availability.
Automate operational tasks to improve efficiency and reduce manual intervention.
Prepare disaster recovery plans and ensure business continuity.
Monitor system health and optimize performance.
Collaborate with development teams to enhance system reliability.
Implement CI/CD pipelines for seamless deployment and release management.
Ensure compliance with security standards, governance policies, and regulatory requirements.

Requirements

Expertise in software development and engineering for large-scale distributed systems.
Strong proficiency in programming languages such as Golang, Java, or Python.
Extensive experience with cloud infrastructure providers (AWS, Azure, or GCP).
Deep knowledge of container orchestration platforms like Kubernetes.
Exceptional problem-solving skills and a passion for building scalable, secure solutions.
Excellent communication skills to collaborate with cross-functional teams.