HPC Support Engineer - TS/SCI Required

THE PHOENIX

Charlottesville, United States of America

2 days ago

Role details

Contract type

Permanent contract

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Experience level

Senior

Job location

Charlottesville, United States of America

Tech stack

Bash

Big Data

C++

Command-Line Interface

Nvidia CUDA

Linux

Distributed Computing Environment

Fortran

InfiniBand

Python

Linux System Administration

Node.js

OpenMP

Open Source Technology

Performance Tuning

Remote Direct Memory Access

Red Hat Enterprise Linux - RHEL

Data Streaming

Toolchain

Scripting (Bash/Python/Go/Ruby)

Cloud Platform System

High Performance Computing

Software Troubleshooting

Parallel Computation

Slurm

Engineering Base

Job description

Phoenix is seeking an HPC Support Engineer to support users executing computational workloads within advanced Linux-based High Performance Computing (HPC) environments. This role is essential to ensuring efficient, reliable execution of distributed workloads, scientific simulations, and GPU-accelerated processing.

You will work directly with users and systems in a cluster-scale environment, helping optimize job performance, troubleshoot issues, and promote HPC best practices.

What You'll Do

Support execution of distributed compute workloads on HPC clusters
Troubleshoot job failures and performance issues
Assist users with scheduler job submission scripts (e.g., Slurm, PBS)
Identify and resolve performance bottlenecks across compute workloads
Support GPU-enabled workloads and CUDA-based processing
Guide users on efficient cluster utilization and HPC best practices
Assist with application execution, compilation, and runtime issues
Develop and maintain automation scripts and tooling

Our technical competencies include Big Data analytics (batch and streaming), Cloud Computing infrastructure, multi-INT visualization, and enterprise architectures. We support operational missions (All-Source, Financial, CND) and serve as Product Owners for our open-source research initiatives.

Requirements

Active TS/SCI clearance
Ability to work onsite in Charlottesville, VA
5+ years of experience in Linux environments supporting HPC or distributed compute workloads
Experience executing or troubleshooting workloads using:

Slurm
PBS / PBS Pro
Torque or similar schedulers

Strong command-line Linux experience (RHEL preferred)
Experience with scripting or automation (Bash, Python, or similar)
Ability to obtain DoD 8140 (8570) IAT Level II certification

Experience supporting HPC cluster environments
Experience with MPI, OpenMP, or parallel computing frameworks
Experience supporting GPU workloads and CUDA environments
Familiarity with scientific or engineering applications in HPC environments
Experience with C/C++ or Fortran and compiler toolchains (GCC, Intel, LLVM)
Experience troubleshooting application build or runtime issues
Experience supporting research labs, university HPC, or defense environments

Technical Environment

You'll work in a cutting-edge environment that includes:

Multi-node Linux HPC clusters
Workload schedulers (Slurm, PBS)
Distributed computing frameworks (MPI, OpenMP)
GPU-enabled compute (CUDA)
High-performance networking (RDMA, InfiniBand)

Scheduler expertise (Slurm, PBS, etc.)
Linux-based compute environment experience
Distributed workload performance tuning
Automation and scripting experience

Benefits & conditions

Paid training, Referral program, Health insurance, 401(k) matching, Paid time off, Vision insurance, Dental insurance, Life insurance

Medical, Dental, Vision Insurance - 100% Company Paid Premiums

STD, LTD, and Life Insurance - 100% Company paid

401K - Automatic 10% company contribution; no matching required

PTO - 4 weeks/year

Holidays - 11 paid/year

Birthdays off with pay

Referral Bonuses - Upfront AND Annually Recurring

Open Source Bonuses - Contribute to our GitHub projects

Professional Development - Paid training, Certifications, and Enrichment

About the company

Phoenix Operations Group is a high-end engineering services company dedicated to protecting and advancing our national cyber resources. As a small company, we rely on innovation to continually advance our employees' skills and provide game-changing solutions to our customers.

Phoenix Operations Group is an Equal Opportunity Employer. Phoenix Operations Group does not discriminate based on race, religion, color, sex, gender, gender identity, sexual orientation, age, non-disqualifying physical or mental disability, national origin, veteran status, or any other basis covered by appropriate law. All employment is decided based on qualifications, merit, and business needs.
