Platform Engineer

Alfa Technology Recruitment Ltd
Charing Cross, United Kingdom
1 month ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English

Job location

Remote
Charing Cross, United Kingdom

Tech stack

API
Border Gateway Protocol
Code Review
Continuous Integration
Linux
DevOps
Distributed Data Store
Distributed Systems
Python
Remote Direct Memory Access
Prometheus
Software Engineering
Web Services
AI Infrastructure
Ceph
Cloud Platform System
High Performance Computing
System Availability
Grafana
Software Application Programming
Build Management
Kubernetes
Bare Metal
Build Tools
Hardware Infrastructure
Terraform

Job description

This is a hands-on engineering role for someone who can write strong Python, work deeply with Kubernetes, design and build platform applications, and operate close to bare metal infrastructure.

You will help build the systems that make GPU compute easier to provision, operate, secure and scale across AI infrastructure environments.

This is not a generic DevOps role. We are not looking for someone who has only maintained pipelines, written Terraform or managed cloud services. We need someone who can build real platform software and understands the infrastructure it runs on.

What you will do

Design and build platform applications, APIs and services

Write production-grade Python for infrastructure and platform use cases

Work with Kubernetes to build scalable platform capabilities

Design and build Kubernetes operators and controllers across compute, storage and networking

Build tooling that improves how bare metal and GPU infrastructure is provisioned, operated and monitored

Translate operational pain points into scalable platform features

Improve platform reliability, observability and performance

Work across Linux, networking, storage and distributed systems

Requirements

Strong Python engineering experience

Strong hands-on Kubernetes experience

Experience designing and building applications, APIs, services or internal platform tooling

Bare metal infrastructure experience

Strong Linux systems experience

Good understanding of networking, storage and distributed systems

Experience building production-grade systems with proper testing, CI/CD, code reviews and clean engineering standards

A practical engineering mindset and the ability to solve real infrastructure problems through software

Preferred experience

Experience building Kubernetes operators, CRDs or controllers

Exposure to GPU infrastructure or high-performance computing (HPC)

Experience with Go or Rust

Knowledge of confidential computing, including TEE, SEV, TDX or CoCo

Experience with Ceph or distributed storage systems

Familiarity with Prometheus, Grafana or OpenTelemetry

Experience with BGP, RDMA or high-performance networking

Exposure to NVIDIA GPU infrastructure or bare metal cloud environments

Why this role matters

AI infrastructure is constrained by the ability to deliver reliable compute at scale. This role sits in the platform layer that connects software engineering with real infrastructure.

You will help build systems that run close to the metal, across Kubernetes, Linux, networking, storage and GPU compute.
