Platform Engineer
Hlx Life Sciences
Charing Cross, United Kingdom
12 days ago
Role details
Contract type: Permanent contract
Employment type: Full-time (> 32 hours)
Working hours: Regular working hours
Languages: English
Compensation: £91K
Job location: Charing Cross, United Kingdom
Tech stack
API
Artificial Intelligence
Backup Devices
Cloud Computing
NVIDIA CUDA
Continuous Integration
Information Engineering
Linux
Disaster Recovery
Distributed Systems
Job Scheduling
Python
Reliability Engineering
Spark
Git
Kubernetes
Data Management
Slurm
Machine Learning Operations
Terraform
Data Pipelines
Job description
AI Platform / ML Infrastructure Engineers
- Kubernetes-based compute platforms
- GPU scheduling, batch & distributed workloads
- Supporting ML training, inference, and experimentation at scale
HPC / GPU Engineers
- Job schedulers, MPI, multi-node workloads
- Hybrid cloud and on-prem compute
- Performance, reliability, and cost optimisation
Strong Data Engineers
- Large-scale data pipelines and data platforms
- Data reliability, orchestration, and observability
- Close collaboration with ML and research teams
What You'll Work On
- Designing and evolving Kubernetes-based compute platforms across hybrid and multi-cloud environments
- Building and operating GPU-enabled infrastructure for ML and scientific workloads
- Developing and maintaining core platform services, APIs, and internal tooling
- Improving CI/CD pipelines and Infrastructure-as-Code workflows
- Implementing monitoring, alerting, and reliability engineering practices
- Ensuring security, data protection, backup, and disaster recovery best practices
- Partnering closely with ML engineers, data scientists, and researchers to unblock compute and data challenges
Requirements
- Strong experience in one or more of:
  - Platform / infrastructure engineering
  - ML infrastructure or MLOps
  - HPC or GPU compute
  - Data engineering at scale
- Solid experience with Linux and cloud environments
- Hands-on work with Kubernetes or distributed systems
- Experience with Python (or similar) for automation or services
- Familiarity with CI/CD, Git-based workflows, and automation
- Strong problem-solving skills and a collaborative mindset
Bonus
- Terraform or other IaC tools
- Slurm, Kueue, Ray, Spark, or similar systems
- GPU tooling (CUDA, Nvidia operators, schedulers)
- Experience supporting ML training or data science teams