Site Reliability Engineer

Ranger Technical Resources
Charing Cross, United Kingdom
3 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior

Job location

Charing Cross, United Kingdom

Tech stack

Amazon Web Services (AWS)
Application Performance Management
Bash
Cloud Computing
Information Systems
Continuous Integration
Linux
DevOps
Disaster Recovery
Fault Tolerance
Python
Linux System Administration
Reliability Engineering
Ansible
Datadog
Scripting (Bash/Python/Go/Ruby)
Autoscaling
Infrastructure as Code (IaC)
Information Technology
Route53
Splunk
New Relic (SaaS)
Go

Job description

  • Implement advanced AWS features (Route53, ALB / NLB, multi-region setups) to ensure global reliability.
  • Maintain and optimize the existing CI / CD pipelines and deployment processes to streamline software delivery, reduce risks, and ensure seamless integration of new features.
  • Collaborate with Development, QA, and DevOps teams to integrate best practices into build and release processes.
  • Implement, manage, and enhance monitoring tools to proactively detect and resolve system issues.
  • Administer and optimize Linux-based servers and applications, ensuring stability, performance, and security.
  • Implement and manage containerization solutions to improve scalability and efficiency.
  • Implement security best practices across AWS environments, ensuring compliance with industry standards and safeguarding cloud infrastructure.
  • Develop automated incident response mechanisms and self-healing solutions to minimize downtime and enhance fault tolerance.
  • Diagnose and resolve infrastructure, networking, and application-related performance issues to ensure operational efficiency.
  • Ensure business continuity by designing and maintaining robust backups, failover strategies, and disaster recovery solutions.
  • Identify, diagnose, and resolve infrastructure or application performance bottlenecks.
  • Create real-time monitoring dashboards and alerting systems to track system health, capacity, and performance trends.
  • Work closely with development teams to fine-tune infrastructure for cost efficiency while maintaining high performance.

Requirements

  • Bachelor's degree or higher in Computer Science, Information Systems, Information Technology, or a related technical field, or equivalent experience.
  • 10+ years of experience in Site Reliability Engineering, DevOps, Infrastructure, or related roles.
  • Deep understanding of AWS and its various modules and services.
  • Strong background in Linux administration and troubleshooting.
  • Proven experience implementing and managing CI / CD pipelines and Infrastructure as Code (IaC) solutions.
  • Proven experience with monitoring and observability tools to proactively manage system health.

Skills and Strengths

  • AWS (Amazon Web Services)
  • Auto Scaling
  • Fargate
  • Route53
  • Observability tools (New Relic, Datadog, Splunk)
  • Scripting (Ansible, Bash, Python, Go)
  • CI / CD
