Site Reliability Engineer - Data Platform

IMC
Amsterdam, Netherlands
2 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English

Job location

Amsterdam, Netherlands

Tech stack

Java
Airflow
Bash
Data as a Service
Data Infrastructure
Software Debugging
Linux
Distributed Data Store
Distributed Systems
Hadoop
Hadoop Distributed File System
Python
Pcap
Octopus Deploy
Performance Tuning
Reliability Engineering
Ansible
Prometheus
SQL Databases
Scripting (Bash/Python/Go/Ruby)
Grafana
Spark
Containerization
Kubernetes
Kafka
Data Management
Puppet
Pagerduty

Job description

IMC operates at the cutting edge of technology to create a competitive advantage. We are also growing quickly and have plenty of complex technical challenges. We're looking for an experienced SRE with a strong Linux, automation, and distributed systems background who can help us standardize deployments, elevate observability, and scale our data platform and other critical data services. You will join our Data Platform team, part of our local data team that supports engineering teams, traders, and other users with all their data needs. The team is responsible for the foundational platform on which our data frameworks and tooling are built, including monitoring and alerting, scalability, and support for standardized deployments.

Your Core Responsibilities:

As an SRE at IMC, you will join a small sub-team that plays a central role in meeting all of the firm's data needs. You'll focus your energy on the following:

  • Design, implement and manage our data platforms.
  • Improve observability with Prometheus, Grafana and other tools.
  • Develop automation processes that allow for scalability and improved reliability of internal tools and systems, supporting an automation-first culture across our data infrastructure.
  • Contribute to long-term architectural improvements - not just fixing issues, but preventing them.
  • Support critical services like HDFS, Kafka and Dremio.

Requirements

  • Strong experience with distributed data platforms (e.g., Kafka, Hadoop, Spark), including full installation of those platforms as well as debugging and performance tuning.
  • Strong experience deploying, configuring, orchestrating and operating software on Linux and Kubernetes.
  • Strong experience with automation, including reading and writing Python.
  • Comfort reading Java source code, and tuning and debugging running JVMs.
  • Comfort with infrastructure as code (Ansible preferred).
  • Comfort with reading, writing and tuning SQL queries run on various query engines.
  • A proactive mindset - you're not just fixing issues but preventing them.
  • Comfort working across teams with minimal oversight.

Our Tech Stack:

  • Infrastructure & Observability: Linux, Prometheus, Grafana, AlertManager, OpsGenie
  • Data Tools: Hadoop (HDFS), Kafka, Spark, Airflow, SQL
  • Automation: Ansible, Puppet, ArgoCD, Kustomize
  • Containerization & Orchestration: Kubernetes
  • Scripting/Automation: Python, Bash
  • Others: Dremio, PCAP infrastructure

About the company

IMC is a global trading firm powered by a cutting-edge research environment and a world-class technology backbone. Since 1989, we've been a stabilizing force in financial markets, providing essential liquidity upon which market participants depend. Across our offices in the US, Europe, Asia Pacific, and India, our talented quant researchers, engineers, traders, and business operations professionals are united by our uniquely collaborative, high-performance culture, and our commitment to giving back. From entering dynamic new markets to embracing disruptive technologies, and from developing an innovative research environment to diversifying our trading strategies, we dare to continuously innovate and collaborate to succeed.
