HPC Architect
Job description
You will take ownership of a high performance computing roadmap that supports critical engineering development across the defence sector. This role allows you to design and influence large scale compute and storage architectures while ensuring long term scalability for complex simulation workloads. You will drive the transition towards software defined data centres and advocate for open source technologies within a secure environment.
What is in it for you
Your package includes a combined pension contribution of up to 14% and a performance related bonus of up to 21%.
Your role
- Lead the architectural strategy and roadmap for large scale high performance computing environments.
- Design and implement scalable compute and storage solutions to meet the demands of fast paced engineering projects.
- Integrate high performance networking solutions to eliminate bottlenecks in high throughput environments.
- Advocate for and implement open source technologies and software defined data centre methodologies.
- Manage complex workload orchestration using tools such as SLURM or Kubernetes to ensure system reproducibility.
Requirements
- British citizenship is essential for this role.
- Ability to obtain and maintain Developed Vetting security clearance.
- Extensive experience architecting large scale CPU and GPU clusters.
- Proven background in parallel file systems and tiered storage design.
- Understanding of high throughput networking including InfiniBand and RDMA.
- Strong experience in the design and delivery of high performance computing infrastructure at scale.
- Deep technical knowledge of workload orchestration and parallel processing.
- Background in software defined data centre architecture and implementation.
- Capability to manage multi national stakeholder requirements and vendor relationships.
- Knowledge of MPI and CUDA libraries within an engineering or scientific context.
- Professional approach to balancing technical performance with cost and data retention requirements.