Infrastructure Engineer - London

smartTrade

Charing Cross, United Kingdom

2 days ago

Role details

Contract type

Permanent contract

Employment type

Full-time (> 32 hours)

Working hours

Shift work

Languages

English

Experience level

Intermediate

Job location

Charing Cross, United Kingdom

Tech stack

Java

Link Aggregation (Ethernet)

Amazon Web Services (AWS)

Apache HTTP Server

Intelligent Platform Management Interface

Bash

Border Gateway Protocol

BIOS

Ubuntu (Operating System)

CentOS

Configuration Management

Continuous Integration

Dynamic Host Configuration Protocol

Linux

RAID

DNS

Elasticsearch

Trunking

Firmware

Apache Hypertext Transfer Protocol Server

Java Web Services

Python

Linux System Administration

Logical Volume Manager

MySQL

Networking Basics

Network Diagrams

Routing

Red Hat Enterprise Linux - RHEL

Ansible

Virtual Local Area Networks

Scripting (Bash/Python/Go/Ruby)

Computer Network Operations

Juniper

Gitlab

GIT

Containerization

Centreon

Kafka

Hardware Infrastructure

Lxc

Puppet

Terraform

Network Server

Dynatrace

Docker

Job description

The role blends Linux systems administration (Ubuntu), containerized compute (LXD/LXC, some Docker), networking, and datacenter operations.

You will partner with engineering, network, and security teams to ensure reliability, performance, and change control in a 24x7, market-facing environment.

This is a production-oriented role: you'll prepare, review and execute changes, troubleshoot live issues, execute maintenance windows, and continuously improve our platform through automation and rigorous documentation.

Our Environment

Servers: Dell, HPE, Supermicro.
Storage: LVM, software and hardware RAID (mdadm, MegaRAID, LSI, ...).
Containers: LXD/LXC (primary), some Docker.
Networking (day-2 ops): VLANs, LACP, ACLs, routing basics; vendors include Dell, Supermicro, Arista, Juniper, VyOS.
Applications & Data: MySQL, Elasticsearch, Kafka, Java, Apache HTTPD, ...
Automation & IaC: Git/GitLab, Ansible, Netbox, Chef, Terraform; scripting with Bash/Python.
Monitoring/Observability: Centreon, Dynatrace.

What You'll Do

Operate and improve Linux fleets (Ubuntu) in production.
Manage HPC bare-metal and LXD/LXC container platforms.
Provide level-3 incident response for infrastructure issues (systems, containers, network paths, storage), restoring service within SLAs and driving post-mortems.
Own Platforms datacenter operations in Slough: rack/stack, cabling, optics, power planning, server installation, console/OOB, inventory management in Netbox, RMA logistics, and vendor coordination (Equinix Smart Hands, carriers, OEMs).
Perform day-2 network operations on switches and firewalls (ACLs, VLANs, LAGs, routing basics), and collaborate closely with network engineering on changes.
Automate with Ansible and Chef for configuration management and Terraform for IaC on AWS where applicable. Build reliable tooling for repeatable ops (config generation, pre-change checks, deployments, and validation).
Contribute to change management (runbooks, maintenance windows, rollback plans) and keep documentation current (network diagrams, inventories, SOPs).
Participate in a Follow-the-Sun operations model, coordinating with your EMEA/APAC peers. Standard business hours are aligned to Central European Time, with flexibility for maintenance windows.

Rotational weekend work (Friday/Saturday/Sunday) for planned changes and datacenter work; a compensatory day is granted during the week.

Requirements

o 2-3+ years operating Linux (Ubuntu, CentOS, Red Hat) in production environments.

o This position requires occasional on-call availability outside of standard business hours to respond to urgent or critical operational issues. Flexibility to be contacted outside regular working hours is required.

o Previous datacenter work exposure: rack/stack, structured cabling (fiber/copper), PDUs, console/OOB, vendor/Smart Hands coordination, and accurate inventory. Without prior experience, a willingness to learn and work in such environments is required.

o Containers: exposure to LXC or Docker in a production environment and an understanding of their inner workings.

o Server hardware & storage: LVM, software RAID, MegaRAID tooling, firmware/BIOS/BMC (iDRAC/iLO/IPMI), and hands-on diagnostics and replacements.

o Networking fundamentals for day-to-day ops: VLANs, LACP, trunking, ACLs, static routes, BGP, DNS/DHCP, link/MTU issues; ability to execute well-scoped changes on Dell/Arista/Juniper/VyOS under peer review.

o Automation & SCM: Bash/Python, Git/GitLab; experience with Chef or Ansible or Puppet in production.

o Clear runbook-style writing, disciplined change control, and calm, structured troubleshooting under time pressure.

Nice to have:

o Familiarity with Equinix processes (cross-connects, tickets, remote hands) and carrier coordination.

o Ops exposure to Netbox, MySQL, Elasticsearch, Kafka, Java services, Apache; ability to collaborate with app teams on infra-adjacent issues.

o Experience with Centreon and Dynatrace (or equivalent monitoring/observability stacks).

o Config management/IaC depth (Ansible, Puppet, Terraform modules, secret management), and CI pipelines in GitLab.

o Deeper networking (EVPN/VXLAN, BGP, multicast) and/or traffic engineering.

About the company

smartTrade Technologies is a software publisher specializing in the trading and finance sector. Its clients primarily include investment banks, stock exchanges, brokers, and pension funds. smartTrade enables real-time computerized management of financial flows among these different stakeholders. Joining smartTrade means becoming a part of an innovative and international company with offices in Aix-en-Provence, London, Geneva, New York, Toronto, and Tokyo. Skill development and career progression are top priorities at smartTrade, offering employees numerous opportunities for learning, advancement, and mobility. Sports and their values of teamwork, performance, and dynamism are integral to the company's culture. Additionally, smartTrade is highly committed to continuously supporting various charitable and environmental initiatives.