Linux HPC Systems Administrator/Engineer
Xcede
Stevenage, United Kingdom
4 days ago
Role details
Contract type: Permanent contract
Employment type: Full-time (> 32 hours)
Working hours: Regular working hours
Languages: English
Experience level: Senior
Compensation: £75K
Job location: Stevenage, United Kingdom
Tech stack
Computing Platforms
Cluster Analysis
Linux
DevOps
Web Servers
Red Hat Enterprise Linux - RHEL
Scientific Computing
Software Engineering
High Performance Computing
Performance Monitor
Slurm
ServiceNow
Job description
- Administer, configure, and support Red Hat Enterprise Linux (RHEL 8/9) environments, focusing on stability, performance, and security
- Provide hands-on onsite support for high-end workstations, resolving hardware and software issues
- Support HPC environments, including clustering and workload management (Slurm)
- Monitor and troubleshoot performance issues, including GPU and networking impacts
- Use ServiceNow for incident, change, and ticket management while driving process improvements
- Manage SSL certificates and assist with web server configuration as required
- Collaborate with stakeholders and vendors, communicating technical topics clearly and building strong working relationships
Technologies:
- Hardware
- Support
- Linux
- RHEL
- Security
- ServiceNow
- Web
- DevOps
We are looking for a Senior Linux HPC Systems Administrator/Engineer to join our team in Stevenage, Hertfordshire. This hybrid role requires onsite presence three days a week, allowing for hands-on hardware support and collaboration in a Linux-based high-performance/scientific computing environment. You will work closely with technical and scientific users to maintain critical infrastructure and high-end workstations. We offer a supportive work environment and opportunities for personal and professional development, and you will play a crucial role in our cutting-edge projects.
Requirements
- Minimum 10 years of enterprise IT experience
- Strong hands-on administration of Red Hat Enterprise Linux (RHEL 8 & 9)
- Experience supporting scientific users, applications, and/or research computing environments
- HPC exposure: Slurm, clusters, and general HPC operations/support
- Strong troubleshooting skills across Linux, hardware, and applications
- Confident stakeholder communication, particularly for onsite support
- Nice to have: ServiceNow experience
- Broader knowledge of networking, performance monitoring tooling, and GPU technologies