HPC Support Engineer - TS/SCI Required

THE PHOENIX
Charlottesville, United States of America
2 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior

Job location

Charlottesville, United States of America

Tech stack

Bash
Big Data
C++
Command-Line Interface
Nvidia CUDA
Linux
Distributed Computing Environment
Fortran
InfiniBand
Python
Linux System Administration
Node.js
OpenMP
Open Source Technology
Performance Tuning
Remote Direct Memory Access
Red Hat Enterprise Linux - RHEL
Data Streaming
Toolchain
Scripting (Bash/Python/Go/Ruby)
Cloud Platform System
High Performance Computing
Software Troubleshooting
Parallel Computation
Slurm
Engineering Base

Job description

Phoenix is seeking an HPC Support Engineer to support users executing computational workloads within advanced Linux-based High Performance Computing (HPC) environments. This role is essential to ensuring efficient, reliable execution of distributed workloads, scientific simulations, and GPU-accelerated processing.

You will work directly with users and systems in a cluster-scale environment, helping optimize job performance, troubleshoot issues, and promote HPC best practices.

What You'll Do

  • Support execution of distributed compute workloads on HPC clusters
  • Troubleshoot job failures and performance issues
  • Assist users with scheduler job submission scripts (e.g., Slurm, PBS)
  • Identify and resolve performance bottlenecks across compute workloads
  • Support GPU-enabled workloads and CUDA-based processing
  • Guide users on efficient cluster utilization and HPC best practices
  • Assist with application execution, compilation, and runtime issues
  • Develop and maintain automation scripts and tooling

Our technical competencies include Big Data analytics (batch and streaming), Cloud Computing infrastructure, multi-INT visualization, and enterprise architectures. We support operational missions (All-Source, Financial, CND) and serve as Product Owners for our open-source research initiatives.

Requirements

  • Active TS/SCI clearance
  • Ability to work onsite in Charlottesville, VA
  • 5+ years of experience in Linux environments supporting HPC or distributed compute workloads
  • Experience executing or troubleshooting workloads using Slurm, PBS / PBS Pro, Torque, or similar schedulers
  • Strong command-line Linux experience (RHEL preferred)
  • Experience with scripting or automation (Bash, Python, or similar)
  • Ability to obtain DoD 8140 (8570) IAT Level II certification
  • Experience supporting HPC cluster environments
  • Experience with MPI, OpenMP, or parallel computing frameworks
  • Experience supporting GPU workloads and CUDA environments
  • Familiarity with scientific or engineering applications in HPC environments
  • Experience with C/C++ or Fortran and compiler toolchains (GCC, Intel, LLVM)
  • Experience troubleshooting application build or runtime issues
  • Experience supporting research labs, university HPC, or defense environments

Technical Environment

You'll work in a cutting-edge environment that includes:

  • Multi-node Linux HPC clusters
  • Workload schedulers (Slurm, PBS)
  • Distributed computing frameworks (MPI, OpenMP)
  • GPU-enabled compute (CUDA)
  • High-performance networking (RDMA, InfiniBand)
  • Scheduler expertise (Slurm, PBS, etc.)
  • Linux-based compute environment experience
  • Distributed workload performance tuning
  • Automation and scripting experience

Benefits & conditions

Paid training, Referral program, Health insurance, 401(k) matching, Paid time off, Vision insurance, Dental insurance, Life insurance

Medical, Dental, Vision Insurance - 100% Company Paid Premiums

STD, LTD, and Life Insurance - 100% Company paid

401(k) - Automatic 10% company contribution; no matching required

PTO - 4 weeks/year

Holidays - 11 paid/year

Birthdays off with pay

Referral Bonuses - Upfront AND Annually Recurring

Open Source Bonuses - Contribute to our GitHub projects

Professional Development - Paid training, Certifications, and Enrichment

About the company

Phoenix Operations Group is a high-end engineering services company dedicated to protecting and advancing our national cyber resources. As a small company, we rely on innovation to continually advance our employees' skills and provide game-changing solutions to our customers.

Phoenix Operations Group is an Equal Opportunity Employer. Phoenix Operations Group does not discriminate based on race, religion, color, sex, gender, gender identity, sexual orientation, age, non-disqualifying physical or mental disability, national origin, veteran status, or any other basis covered by appropriate law. All employment is decided based on qualifications, merit, and business needs.