Senior Site Reliability Engineer

REWE Group
Vienna, Austria
4 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Shift work
Languages
English
Experience level
Senior

Job location

Remote
Vienna, Austria

Tech stack

Java
Microsoft Windows
JIRA
Bash
Cloud Computing
Computer Programming
Continuous Integration
Linux
DevOps
DNS
Monitoring of Systems
Hypertext Transfer Protocols (HTTP)
Identity and Access Management
Python
Reliability Engineering
Ansible
Prometheus
Subversion
TCP/IP
Transport Layer Security
Fluentd
SaltStack
Grafana
Reliability of Systems
Firewalls (Computer Science)
GitLab
Git
Containerization
Kubernetes
Information Technology
Terraform
Splunk
Software Version Control
Docker
ELK
ServiceNow
Go

Job description

  • Design, implement, and maintain highly reliable and scalable infrastructure and services using cloud platforms (e.g. GCP).
  • Automate repetitive tasks using tools such as Terraform, Ansible and SaltStack.
  • Collaborate with development and operations teams to ensure smooth deployment and operation of services using CI/CD pipelines (e.g. GitLab).
  • Establish and monitor Service Level Indicators (SLIs) and Service Level Objectives (SLOs) to ensure system reliability using monitoring tools like Prometheus and Grafana.
  • Perform capacity planning and optimization to handle growth and scale.
  • Lead incident management and post-mortem processes, including root cause analysis of system failures, to ensure continuous improvement.

Requirements

  • Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent experience).
  • 5+ years of experience as an SRE, DevOps Engineer, or similar role.
  • Strong understanding of cloud infrastructure (specifically GCP) and containerization technologies (Docker, Kubernetes).
  • Proficiency in scripting and programming languages (Java, Python, Go, Bash).
  • Experience with monitoring and observability tools (Prometheus, Grafana, ELK Stack, Fluentd, Splunk).
  • Solid knowledge of networking (DNS, TCP/IP, HTTP), security best practices (SSL/TLS, firewalls, IAM), and system administration (Linux, Windows).
  • Experience with incident management tools (Jira, ServiceNow), version control systems (Git, SVN) and CI/CD.

Benefits & conditions

  • A family-friendly company culture with flexible working hours and remote working options available according to your individual needs
  • Numerous training and further development opportunities within the Group (5% of working time for self-organized training and education)
  • A lunch allowance
  • Staff shopping and travel discounts
  • Extensive workplace wellness programme including fitness classes, massages, etc.
  • An industry-standard, attractive and performance-based annual gross salary starting at 60.000 Euro (on a full-time basis) with the possibility of higher pay according to experience and qualifications

No matter where you are in your career, we have a path for you, whether you're looking for your first job, advancement in your field, or a career change. We're proud to employ great people who are passionate about their jobs, and who are all different. No matter who you are, what you need and where you're going, REWE Group can be a part of it. Apply now!

About the company

As the IT organization of REWE Group Austria, we and our more than 700 employees develop innovative IT products and services for all our corporate divisions in Austria and abroad, setting the tone for modern trade. We are looking for a highly skilled and experienced Senior SRE (Site Reliability Engineer) to join our team. The ideal candidate will ensure the reliability, availability, and performance of our critical infrastructure and services. This role involves collaborating with cross-functional teams to build and maintain scalable and efficient systems, implement automation, and drive improvements in system reliability.
