HPC Support Engineer - TS/SCI Required
Job description
Phoenix is seeking an HPC Support Engineer to support users executing computational workloads within advanced Linux-based High Performance Computing (HPC) environments. This role is essential to ensuring efficient, reliable execution of distributed workloads, scientific simulations, and GPU-accelerated processing.
You will work directly with users and systems in a cluster-scale environment, helping optimize job performance, troubleshoot issues, and promote HPC best practices.
What You'll Do
- Support execution of distributed compute workloads on HPC clusters
- Troubleshoot job failures and performance issues
- Assist users with scheduler job submission scripts (e.g., Slurm, PBS)
- Identify and resolve performance bottlenecks across compute workloads
- Support GPU-enabled workloads and CUDA-based processing
- Guide users on efficient cluster utilization and HPC best practices
- Assist with application execution, compilation, and runtime issues
- Develop and maintain automation scripts and tooling

Our technical competencies include Big Data analytics (batch and streaming), Cloud Computing infrastructure, multi-INT visualization, and enterprise architectures. We support operational missions (All-Source, Financial, CND) and serve as Product Owners for our open-source research initiatives.
Requirements
- Active TS/SCI clearance
- Ability to work onsite in Charlottesville, VA
- 5+ years of experience in Linux environments supporting HPC or distributed compute workloads
- Experience executing or troubleshooting workloads using:
  - Slurm
  - PBS / PBS Pro
  - Torque or similar schedulers
- Strong command-line Linux experience (RHEL preferred)
- Experience with scripting or automation (Bash, Python, or similar)
- Ability to obtain DoD 8140 (8570) IAT Level II certification
- Experience supporting HPC cluster environments
- Experience with MPI, OpenMP, or parallel computing frameworks
- Experience supporting GPU workloads and CUDA environments
- Familiarity with scientific or engineering applications in HPC environments
- Experience with C/C++ or Fortran and compiler toolchains (GCC, Intel, LLVM)
- Experience troubleshooting application build or runtime issues
- Experience supporting research labs, university HPC, or defense environments
Technical Environment
You'll work in a cutting-edge environment that includes:
- Multi-node Linux HPC clusters
- Workload schedulers (Slurm, PBS)
- Distributed computing frameworks (MPI, OpenMP)
- GPU-enabled compute (CUDA)
- High-performance networking (RDMA, InfiniBand)
- Scheduler expertise (Slurm, PBS, etc.)
- Linux-based compute environment experience
- Distributed workload performance tuning
- Automation and scripting experience
Benefits & conditions
Medical, Dental, Vision Insurance - 100% Company Paid Premiums
STD, LTD, and Life Insurance - 100% Company paid
401(k) - Automatic 10% company contribution; no matching required
PTO - 4 weeks/year
Holidays - 11 paid/year
Birthdays off with pay
Referral Bonuses - Upfront AND Annually Recurring
Open Source Bonuses - Contribute to our GitHub projects
Professional Development - Paid training, Certifications, and Enrichment