HPC Systems Engineer - TS/SCI Required job in Charlottesville

THE PHOENIX

Charlottesville, United States of America

3 months ago

Role details

Contract type

Permanent contract

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Experience level

Senior

Job location

Charlottesville, United States of America

Tech stack

Systems Engineering

Bash

Big Data

Command-Line Interface

Cloud Computing

Configuration Management

Nvidia CUDA

Linux

Distributed File Systems

General Parallel File Systems

InfiniBand

Python

Network Layer

Linux System Administration

Node.js

OpenMP

Open Source Technology

Parallel Computing

Performance Tuning

Remote Direct Memory Access

Red Hat Enterprise Linux - RHEL

Ansible

Scientific Computing

Data Streaming

Scripting (Bash/Python/Go/Ruby)

High Performance Computing

Slurm

Puppet

Docker

Job description

Phoenix is seeking a High Performance Computing (HPC) Systems Engineer to support the build, configuration, and sustainment of advanced Linux-based HPC cluster environments. This role is critical to enabling distributed compute workloads, scientific simulations, and GPU-accelerated processing within a secure research environment.

You will work in a cluster-scale computing environment where performance optimization, scheduler configuration, and distributed workload execution are key to mission success.

What You'll Do

Configure, deploy, and maintain multi-node Linux HPC clusters
Administer and optimize workload schedulers (e.g., Slurm, PBS)
Troubleshoot distributed compute workloads across cluster environments
Analyze performance across compute, storage, and network layers
Support GPU-enabled workloads and CUDA-based processing
Develop and maintain automation scripts and operational tooling
Assist in cluster provisioning and node deployment (e.g., xCAT, Warewulf)
Support containerized workloads within HPC environments

Our technical competencies include Big Data analytics (batch and streaming), Cloud Computing infrastructure, multi-INT visualization, and enterprise architectures. We support operational missions (All-Source, Financial, CND) and serve as Product Owners for our open-source research initiatives.

Requirements

Active TS/SCI clearance
Ability to work onsite in Charlottesville, VA
6+ years of Linux systems administration experience
Hands-on experience with HPC clusters or distributed compute environments
Experience with workload schedulers such as:

Slurm
PBS / PBS Pro
Torque or similar

Strong command-line Linux administration skills (RHEL preferred)
Experience with scripting or automation (Bash, Python, or similar)
Ability to obtain DoD 8140 (8570) IAT Level II certification

Experience administering multi-node HPC cluster environments
Familiarity with parallel/distributed file systems (Lustre, BeeGFS, GPFS)
Experience with MPI, OpenMP, or other parallel computing frameworks
Experience supporting GPU compute environments (CUDA)
Familiarity with container technologies:

Docker, Podman, Singularity/Apptainer

Experience with configuration management tools (Ansible, Puppet)
Background supporting research labs, university HPC, or defense environments

Technical Environment

You'll work with cutting-edge technologies, including:

Linux-based HPC clusters
High-performance networking (RDMA, InfiniBand)
Distributed compute frameworks (MPI, OpenMP)
GPU-enabled processing (CUDA)
Cluster provisioning tools (xCAT, Warewulf)

HPC cluster administration
Research computing or university HPC centers
National labs or scientific computing programs
Defense or intelligence community computing environments

Scheduler expertise (Slurm, PBS, etc.)
Linux administration in multi-node environments
Troubleshooting distributed workloads
Automation and scripting experience

Benefits & conditions

Medical, Dental, Vision Insurance - 100% Company Paid Premiums

STD, LTD, and Life Insurance - 100% Company paid

401(k) - Automatic 10% company contribution, no matching required

PTO - 4 weeks/year

Holidays - 11 paid/year

Birthdays off with pay

Referral Bonuses - Upfront AND Annually Recurring

Open Source Bonuses - Contribute to our GitHub projects

Professional Development - Paid training, Certifications, and Enrichment

About the company

Phoenix Operations Group is a high-end engineering services company dedicated to protecting and advancing our national cyber resources. As a small company, we rely on innovation to continually advance our employees' skills and provide game-changing solutions to our customers.

Phoenix Operations Group is an Equal Opportunity Employer. Phoenix Operations Group does not discriminate based on race, religion, color, sex, gender, gender identity, sexual orientation, age, non-disqualifying physical or mental disability, national origin, veteran status, or any other basis covered by appropriate law. All employment is decided based on qualifications, merit, and business needs.
