Hybrid Hardware & Software Support Engineer - HPC

Aeroficial Intelligence
Reading, United Kingdom

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Intermediate

Job location

Reading, United Kingdom

Tech stack

Artificial Intelligence
Bash
Configuration Management
Linux
General Parallel File Systems
Monitoring of Systems
Icinga
InfiniBand
Python
Kernel-Based Virtual Machine
Linux System Administration
Routing
OpenStack
Ansible
Prometheus
Subsystems
TCP/IP
Virtualization Technology
Ceph
Scripting (Bash/Python/Go/Ruby)
High Performance Computing
Grafana
Git
Kubernetes
Information Technology
Slurm
Puppet
Docker

Job description

This role is primarily on-site at a customer facility near Reading, Berkshire, with occasional support for additional HPC installations across Europe.

Bull's High-Performance Computing (HPC), Artificial Intelligence & Quantum Business Unit is seeking a Hybrid Hardware & Software Support Engineer to join our HPC Services team. This is a highly visible, customer-facing operational role supporting advanced HPC infrastructures in the UK. You will work across computing, storage, and networking layers, ensuring the deployment, stability, and performance of large-scale Linux-based systems. Prior HPC experience is an advantage but not mandatory: strong Linux and infrastructure engineers eager to grow into HPC & AI are encouraged to apply.

Deployment & System Bring-Up

  • Install, configure, and integrate HPC cluster components (compute, storage, networking).
  • Perform system installation, initial configuration, and operational readiness checks.
  • Apply patches, updates, and conduct routine maintenance activities.

Hybrid Hardware & Software Support

  • Provide Level 1 and Level 2 operational support for HPC environments.
  • Diagnose and resolve issues involving:
      • Linux operating systems
      • Enterprise server hardware
      • High-speed interconnects
      • Storage subsystems
  • Conduct root cause analysis and implement corrective actions.
  • Escalate appropriately within the global support organisation when needed.

Operations & Incident Handling

  • Monitor system health and respond to incidents proactively.
  • Perform troubleshooting in secure, mission-critical environments.
  • Maintain detailed and accurate documentation of incidents and resolutions.

Customer Interface

  • Act as the primary technical contact on-site.
  • Communicate effectively regarding incidents, planned maintenance, and system status.
  • Build trusted relationships with customer technical stakeholders.
  • Represent Bull professionally in sensitive and high-profile environments.

Requirements

  • Strong Linux expertise (Red Hat and/or Debian-based environments)
  • Solid understanding of enterprise server hardware (CPU, memory, storage, diagnostics)
  • Scripting skills in Bash and/or Python
  • Strong networking fundamentals (TCP/IP, routing, switching, security basics)
  • Hands-on experience with infrastructure deployment, configuration, and maintenance
  • Excellent troubleshooting and analytical abilities
  • Proactive mindset and ability to work independently

Desirable Skills & Experience

Valuable, but not mandatory:

  • Experience with HPC clusters
  • High-speed networking (40/100GbE, InfiniBand)
  • Virtualisation technologies (KVM, OpenStack)
  • Storage systems (Ceph, SAN/NAS)
  • Parallel filesystems (Lustre, GPFS, BeeGFS)
  • Containers (Docker, Podman, Kubernetes)
  • Configuration management (Ansible, Puppet)
  • Monitoring and observability tools (Prometheus, Grafana, Icinga)
  • Workload managers (Slurm, PBS Pro)
  • Git version control

The ideal candidate:

  • Is hands-on, operationally focused, and detail-oriented
  • Thrives in secure, mission-critical environments
  • Approaches troubleshooting methodically, even under pressure
  • Communicates clearly with both technical and non-technical stakeholders
  • Takes full ownership of incidents through to resolution
  • Is motivated to learn continuously and expand their technical expertise

Education & Experience

Option 1:

  • Degree in Computer Science, Engineering, or related field + at least 2 years of relevant experience

Option 2:

  • 5+ years of relevant industry experience

Strong early-career candidates with solid technical foundations will also be considered.

Benefits & conditions

  • Working on advanced HPC and digital infrastructure projects
  • Continuous learning and technical skill development
  • Career growth within a global technology organisation
  • Participation in internal initiatives and community-focused activities

What happens next?

  • Your application will be reviewed (1-2 business days)
  • Short-listed candidates will be contacted for a discussion with HR
  • Interview with management team
  • Feedback (1-10 business days after the interview)

Join us! Here, your ideas, your curiosity and your technical excellence directly shape the next era of advanced computing - unlocking enterprise value, accelerating scientific progress and driving positive impact for society.
