HPC Linux Engineer

The One Group

Cambridge, United Kingdom

10 days ago

Role details

Contract type

Permanent contract

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Compensation

£65K

Job location

Remote

Cambridge, United Kingdom

Tech stack

Computing Platforms

Software Documentation

Linux

Distributed Systems

Red Hat Enterprise Linux - RHEL

Operational Systems

Job description

We're looking for a highly capable HPC Linux Engineer to join a small, specialist IT function supporting a demanding compute-led environment. This is a hands-on role for someone who enjoys owning infrastructure end-to-end and improving performance, reliability, and scalability across complex Linux-based systems.

The company offers a strong benefits package, including 28 days of annual leave, a bonus scheme, equity, and private health insurance, as well as a number of smaller benefits.

Working closely with the IT Manager and engineering teams, you'll play a key role in the ongoing development of a high-performance computing (HPC) platform, ensuring it remains secure, efficient, and fit for future growth. This role would suit an experienced Linux Systems Administrator who enjoys problem-solving in technically challenging environments and is looking for additional responsibility.

Key areas of responsibility include:

Operating and enhancing a high-performance Linux compute platform, covering servers, storage, and associated services
Tracking utilisation, capacity, and performance, and taking action to prevent issues before they impact users
Introducing and refining workload management and scheduling solutions within the HPC estate
Reducing manual effort through effective use of automation
Contributing to infrastructure roadmaps, upgrades, and scaling decisions
Creating and maintaining technical standards, runbooks, and system documentation
Acting as an escalation point for complex platform-related issues
Maintaining a strong security posture and ensuring systems align with internal policies and external requirements

Requirements

Several years' experience supporting Linux infrastructure in compute-intensive or highly available environments
Practical exposure to HPC platforms, including workload schedulers and distributed systems
Strong knowledge of Red Hat-derived operating systems
A proven track record of automating operational tasks using scripting or infrastructure-as-code tools
A solid understanding of networking concepts as they apply to server and cluster environments
Experience implementing or maintaining backup, recovery, and data protection solutions
The confidence to communicate clearly with both technical and non-technical stakeholders