Site Reliability Engineer - Data Platform

IMC

Amsterdam, Netherlands

2 days ago

Role details

Contract type

Permanent contract

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Job location

Amsterdam, Netherlands

Tech stack

Java

Airflow

Bash

Data as a Service

Data Infrastructure

Software Debugging

Linux

Distributed Data Store

Distributed Systems

Hadoop

Hadoop Distributed File System

Python

PCAP

Octopus Deploy

Performance Tuning

Reliability Engineering

Ansible

Prometheus

SQL Databases

Scripting (Bash/Python/Go/Ruby)

Grafana

Spark

Containerization

Kubernetes

Kafka

Data Management

Puppet

PagerDuty

Job description

IMC uses cutting-edge technology to gain a competitive edge. We also grow quickly and have plenty of complex technical challenges. We're looking for an experienced SRE with a strong Linux, automation, and distributed systems background who can help us standardize deployments, elevate observability, and scale our data platform and other critical data services. You will join our Data Platform team, part of our local data team that supports engineering teams, traders and other users with all their data needs. The Data Platform team is responsible for the foundational platform that our data frameworks and tooling are built on. That includes monitoring and alerting, scalability, and support for standardized deployments.

Your Core Responsibilities:

As an SRE at IMC, you will join a small sub-team that plays a central role in meeting all our data needs. You will:

Design, implement and manage our data platforms.
Improve observability with Prometheus, Grafana and other tools.
Develop automation processes that allow for scalability and improved reliability of internal tools and systems, supporting an automation-first culture across our data infrastructure.
Contribute to long-term architectural improvements - not just fixing issues, but preventing them.
Support critical services like HDFS, Kafka and Dremio.

Requirements

Strong experience with distributed data platforms (e.g. Kafka, Hadoop, Spark), including full installation of those platforms as well as debugging and performance tuning.
Strong experience deploying, configuring, orchestrating and operating software on Linux and Kubernetes.
Strong experience with automation, including reading and writing Python.
Comfort reading Java source code, and tuning and debugging running JVMs.
Comfort with infrastructure as code (Ansible preferred).
Comfort with reading, writing and tuning SQL queries run on various query engines.
A proactive mindset - you're not just fixing issues but preventing them.
Comfort working across teams with minimal oversight.

Our Tech Stack:

Infrastructure & Observability: Linux, Prometheus, Grafana, AlertManager, OpsGenie
Data Tools: Hadoop (HDFS), Kafka, Spark, Airflow, SQL
Automation: Ansible, Puppet, ArgoCD, Kustomize
Containerization & Orchestration: Kubernetes
Scripting/Automation: Python, Bash
Others: Dremio, PCAP infrastructure

About the company

IMC is a global trading firm powered by a cutting-edge research environment and a world-class technology backbone. Since 1989, we've been a stabilizing force in financial markets, providing essential liquidity upon which market participants depend. Across our offices in the US, Europe, Asia Pacific, and India, our talented quant researchers, engineers, traders, and business operations professionals are united by our uniquely collaborative, high-performance culture, and our commitment to giving back. From entering dynamic new markets to embracing disruptive technologies, and from developing an innovative research environment to diversifying our trading strategies, we dare to continuously innovate and collaborate to succeed.