Senior Site Reliability Engineer

ICEO - Venture Builder
2 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior
Compensation
€ 90K

Job location

Remote

Tech stack

Secure Shell (SSH)
Java
Amazon Web Services (AWS)
Apache HTTP Server
Confluence
JIRA
Azure
Bash
C++
Ubuntu (Operating System)
Continuous Integration
Debian Linux
Linux
DevOps
Document Management Systems
DNS
Elasticsearch
Fault Tolerance
HTTP Secure
Java Virtual Machine (JVM)
Python
PostgreSQL
Linux System Administration
Nginx
Node.js
OpenVPN
Redis
Reliability Engineering
Prometheus
Systems Architecture
TCP/IP
Tripwire
Wide Area Networks
Data Logging
Load Balancing
Okta
Fluentd
Istio
Grafana
Reliability of Systems
Firewalls (Computer Science)
Containerization
Kubernetes
Kafka
Bitbucket
Kibana
REST
Terraform
Software Version Control
Google Meet
Docker
Go

Job description

  • Own and lead the definition and execution of the SRE vision and strategy, ensuring alignment with business objectives and engineering priorities.
  • Architect, maintain and develop infrastructure within GCP and GKE, focusing on performance, security, availability and reliability.
  • Develop automated solutions for system reliability, capacity planning and incident response to minimize manual intervention.
  • Collaborate with engineering and product teams to design and implement highly available, fault-tolerant systems.
  • Own and deliver Service Level Objectives, Service Level Indicators and error budgets to enhance system reliability.
  • Create and maintain documentation for implemented solutions.
  • Mentor engineering teams on SRE principles, DevOps culture and best practices.
  • Stay updated on industry trends, evaluating new tools and methodologies to improve system reliability.
  • Balance security, performance and flexibility in all decisions.
  • Participate in daily stand-ups, planning and other team meetings.

  • Communication: Slack, Google Meet
  • Work management: Jira
  • Documentation: Confluence
  • Repository: Bitbucket
  • Automation & IaC: Bash, Python, Go, Terraform
  • Observability: Prometheus, Grafana, Jaeger, Tempo, Loki
  • CI/CD: Bitbucket Pipelines, ArgoCD
  • Containerization & orchestration: Docker, Kubernetes, Helm
  • Security tooling: SOPS, Okta, TFsec, Trivy, Istio
  • Stateful: PostgreSQL, TimescaleDB, Redis, Kafka, Elasticsearch
  • HTTP: Nginx & Ingress-Nginx

Recruitment process

  • Stage 1 - Screening with Talent Acquisition Partner (45 min).
  • Stage 2 - Technical interview with Senior Developer (1 h, system architecture focus).
  • Stage 3 - Interview with Lead of DevOps - hands-on Docker/Kubernetes/Linux scenarios (1 h).
  • Stage 4 - Final interview with Head of Technology (30 min).
  • Background check after offer extension.

Requirements

  • 5+ years in a DevOps, SRE or similar role, working on a product with long-term platform maintenance.
  • 10+ years of experience in technology.
  • Independent platform management experience with autonomous decision making.
  • Proficiency in at least one programming language (Python, Go, C++ or Java).
  • Extensive experience with JVM, Node.js and related application maintenance.
  • Advanced Linux administration (Debian/Ubuntu).
  • Strong networking knowledge (LAN/WAN, firewall, proxy, load balancers, HTTP(S), DNS, SSH, TCP/IP, REST).
  • Hands-on experience with observability tools (Prometheus, Grafana, OpenTelemetry, etc.).
  • Knowledge of Kafka, Redis, Nginx and Docker.
  • Experience with CI/CD and version control systems.
  • Expertise in Kubernetes, Helm and Helm charts.
  • Public cloud experience (GCP, AWS or Azure) including redundancy and disaster-recovery design.
  • Design, implementation and maintenance of scalable, high-performance infrastructure (HPA, KEDA, affinity rules).
  • Proficient in written and spoken English (B2 or higher).

Nice to have

  • Financial domain experience.
  • Interest in finance, trading and crypto.
  • Experience with Argo CD and Argo Rollouts.
  • Knowledge of Apache HTTP Server, OpenVPN.
  • Logging tools experience (Kibana, FluentD, Elasticsearch, Loki).
  • Large PostgreSQL database configuration and optimization.
  • SSO and Okta expertise.
  • Self-motivated and able to work independently with minimal supervision.

Benefits & conditions

What we offer

  • Remote-first company - work from anywhere.
  • Flexible working hours with core time 11 am-3 pm CET.
  • 38 days paid vacation plus 14 days paid sick leave.
  • Autonomy to make decisions and explore new ideas.
  • B2B salary of €75,000-90,000 per year plus on-call compensation.
