Site Reliability Engineer (Operations)

thinkproject
Utrecht, Netherlands
2 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English

Job location

Utrecht, Netherlands

Tech stack

Amazon Web Services (AWS)
Azure
Bash
Databases
Linux
DevOps
Disaster Recovery
Monitoring of Systems
Information Technology Operations
Python
Windows Server
PowerShell
Reliability Engineering
Prometheus
Software Vulnerability Management
Web Services
Scripting (Bash/Python/Go/Ruby)
Grafana
Information Technology
New Relic (SaaS)

Job description

  • Drive initiatives that improve service reliability, availability and operational quality.
  • Identify recurring operational challenges and implement sustainable improvements.
  • Contribute to the continuous evolution of operational standards, processes and best practices.

Monitoring & Incident Excellence

  • Enhance monitoring, alerting and observability capabilities across our platforms.
  • Analyze incidents and operational trends to identify opportunities for improvement.
  • Support Root Cause Analysis (RCA) activities and implement preventive measures to reduce recurring incidents.
  • Improve incident response processes and operational readiness.

Automation & Platform Efficiency

  • Design and implement automation solutions that reduce manual effort and improve operational efficiency.
  • Support platform scalability, resilience and operational maturity through engineering improvements.
  • Evaluate and introduce innovative technologies and approaches to optimize operations.

Security, Patching & Resilience

  • Drive improvements in vulnerability management, patching and infrastructure hardening.
  • Support backup, recovery and disaster recovery initiatives.

Requirements

  • Bachelor's degree in Computer Science, Information Technology or a related field, or equivalent practical experience.
  • Experience in Site Reliability Engineering, IT Operations, System Administration or DevOps environments.
  • Knowledge of cloud platforms such as Azure, AWS and/or GCP.
  • Experience with Linux and Windows server environments.
  • Familiarity with monitoring and observability platforms such as New Relic, Grafana, Prometheus or ELK.
  • Understanding of networking, databases and web services.
  • Experience troubleshooting incidents and operational issues in production environments.
  • Basic scripting or automation knowledge (e.g., Bash, PowerShell or Python).
  • Strong analytical and troubleshooting abilities.
  • Ability and willingness to work in a 24/7 shift-based operating model.
  • Strong communication skills and a collaborative team-oriented mindset.

Benefits & conditions

By combining information management expertise and in-depth knowledge of the building, infrastructure, and energy industries, thinkproject empowers customers to efficiently deliver, operate, regenerate, and dispose of their built assets across their entire lifecycle through a Connected Data Ecosystem.

About the company

thinkproject was founded in 2000 in Munich, Germany. Since then, the company has grown into the leading provider of cross-enterprise collaboration and information management in Europe.

Global customers from the construction and engineering industries are served from thinkproject’s home base in Munich and via a range of subsidiaries across Europe.

thinkproject addresses today’s digitization challenges in construction and engineering by providing state-of-the-art software solutions as well as industry expert consulting and services.
