Site Reliability Engineer - Fixed Term Contract

Cboe

Charing Cross, United Kingdom

1 month ago

Role details

Contract type

Temporary contract

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Experience level

Intermediate

Job location

Charing Cross, United Kingdom

Tech stack

Algorithmic Trading

Amazon Web Services (AWS)

Data analysis

Systems Engineering

C++

Configuration Management

Computer Engineering

Linux

Disaster Recovery

Network Interface Controllers

Python

Linux System Administration

Multicasting

Network administration

Performance Tuning

Reliability Engineering

Shell Script

Software Engineering

SQL Databases

Data Streaming

TCP/IP

System Availability

Information Technology

Low Latency

Bare Metal

Job description

The Site Reliability Engineer role is filled by experienced technologists with a diverse set of skills ranging from software development to systems, network, application, and/or database management. The Cboe Site Reliability Engineering team is a highly skilled unit responsible for platform engineering, configuration management, implementation, capacity planning, performance tuning, analysis, troubleshooting, reporting, and process automation. The Site Reliability Engineer provides technical support to Cboe Trade Desk and Operations Support Center staff as needed, and works closely with Software Engineering, Systems Engineering, and Network Engineering teams to troubleshoot complex issues and to coordinate and support platform configuration updates. A Site Reliability Engineer must be able to work independently, with little to no direct supervision.

Provide technical support and operational oversight to ensure resiliency and high availability of critical trading platforms across production, disaster recovery, and certification environments.
Monitor systems, troubleshoot issues end-to-end, perform root cause analysis, and drive long-term stability improvements.
Analyze performance of real-time trading systems, investigate software defects, and support build and deployment activities.
Operate and maintain low-latency bare-metal infrastructure, including hardware health, Linux OS tuning, and kernel-bypass networking stacks (Solarflare/Onload).
Apply strong understanding of multicast networking, NIC configuration, and market-data feed delivery topology to triage incidents and coordinate with network engineering.
Work hands-on within physical low-latency environments where bare-metal expertise is essential.
Develop and enhance operational reporting, analyze technical datasets (order entry, market data, matching engine logs), and execute SQL-based investigations to support internal stakeholders.
Manage configuration of trading platforms, evaluate change impacts, and support feature rollouts aligned with business requirements.
Drive automation and toil reduction through Python-based tooling and manage operational batch workflows including scheduling, dependency handling, and failure recovery.
Contribute to exchange capacity planning, support cross-team scaling initiatives, and participate in regular capacity reviews.
Support weekend testing activities such as capacity testing, stress testing and disaster recovery exercises, and participate in a 24×7 on-call rotation.

Requirements

Do you have experience in TCP/IP?
Do you have a Bachelor's degree?

Bachelor's or higher degree in Computer Science, Computer Engineering, Software Engineering, or a related discipline
5+ years of experience as a systems administrator, software developer, SRE, or within a technical operations role in a financial or mission-critical infrastructure environment
3+ years of hands-on Linux administration and troubleshooting in bare-metal, low-latency infrastructure environments
2+ years of proficiency in Python, SQL, and Linux shell scripting for operational automation and data analysis
Working knowledge of network administration concepts including TCP/IP, multicast, NIC tuning, and low-latency networking principles (preferred)
Strong desire to learn about ultra-low latency financial platforms
Previous experience supporting ultra-low latency financial platforms (preferred)
Ability to read and evaluate C++ (preferred)
Network Administration experience (preferred)
Exposure to cloud environments such as AWS and hybrid infrastructure models (advantageous but not required)