Site Reliability Engineer

VanHack
24 days ago

Role details

Contract type: Permanent contract
Employment type: Full-time (> 32 hours)
Working hours: Regular working hours
Languages: English

Job location

Remote

Tech stack

Amazon Web Services (AWS)
Data analysis
Azure
Reliability Engineering
Prometheus
Grafana
Reliability of Systems
Kubernetes
Terraform
Jenkins

Job description

We are seeking a Senior Site Reliability Engineer (f/m/d) to join our team in Barcelona, supporting the continued growth and stability of our cloud-based infrastructure.

In this role, you will be responsible for ensuring the reliability, scalability, and efficiency of our systems across multiple cloud platforms. You'll develop and enhance monitoring, observability, and automation frameworks, driving improvements in uptime and performance. Collaborating closely with development, security, and product teams, you'll help strengthen system resilience and optimize incident response and reliability processes using data-driven insights.

Requirements

  • Strong experience with cloud platforms such as AWS, GCP, Linode, or Azure
  • Proficiency with container orchestration technologies (Kubernetes, EKS, or GKE)
  • Hands-on experience with observability tools (Grafana-Prometheus stack)
  • Skilled in CI/CD pipelines and infrastructure-as-code (Jenkins, Argo, Terraform)
  • Analytical, proactive, and passionate about continuous improvement and system reliability
