Site Reliability Engineer

PayRetailers

Barcelona, Spain

2 days ago

Role details

Contract type

Permanent contract

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Experience level

Senior

Job location

Remote

Barcelona, Spain

Tech stack

API

Artificial Intelligence

Application Performance Management

Bash

Cloud Computing

Python

PostgreSQL

Microsoft SQL Server

MongoDB

Software Architecture

Reliability Engineering

Prometheus

Grafana

Nintex

Job description

Site Reliability Engineers are the guardians of our reliability promise. They deliver a highly reliable, resilient, and cost-efficient platform that consistently meets business and customer expectations for availability and performance.

Increase automation of operational activities to reduce downtime risk, in collaboration with Platform Engineering and Domain Squads.
Drive systemic improvements across engineering teams based on incident RCAs and telemetry insights.
Implement non-functional improvements (resilience, performance, reliability) directly in code, with Domain Squads reviewing and approving changes.
Promote adoption of SRE best practices across development teams (integration patterns, monitoring, alerting, real-time tracing).
Provide cross-platform observability capabilities above and beyond what the Domain Squads provide. Investigate issues and incidents and propose/implement changes as deemed necessary.
Continuously review logs, metrics, and alerts to identify and/or implement continuous improvements.
Design non-functional tests and run them continuously to ensure that we build in quality up to and including production.

Job Benefits

Hybrid model: 3 days per week in the office and 2 days working from home. Lunch is on us when you are in the office!
26 vacation days per year
Language classes & professional courses
Free catering & snacks in the office
Private health insurance
An afternoon off on your birthday

If you're passionate about tech, innovation, and want to thrive in an environment that values collaboration and diversity, this role might be the perfect fit for you! Apply today and help us shape the future of the PayTech industry!

Requirements

The ideal candidate should meet all of the following requirements. However, we believe in self-learning and adaptation, so we can be flexible on certain requirements.

What Is a Must

Proactive attitude, always on the lookout for improvement opportunities.
Strong scripting skills (Python, Bash).
Experience with cloud platforms.
Knowledge of Grafana, Application Insights, OpenTelemetry, Prometheus.
5 years of DBA experience creating and maintaining databases in SQL Server (MongoDB or PostgreSQL).
Fluent level of English, able to conduct technical meetings in English.

What Is Nice To Have

Experience with non-functional and production testing.
Analytical mindset, being able to connect the dots and establish cause and effect.
Experience with containers and container orchestration platforms (EKS/AKS).
Understanding of APIs and asynchronous distributed software architectures.
Working knowledge of AI-enabled tools like VS Code, Claude Code, etc.
Demonstrable experience with applying AI to Site Reliability Engineering.
Knowledge of process automation tools like n8n.
Working experience with chaos engineering.