HPC Engineer

Sabre Systems, Inc.
Dallas, United States of America
2 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior

Job location

Dallas, United States of America

Tech stack

Backup Devices
Big Data
C++
Computer Engineering
System Configuration
Scientific Data Archiving
File Systems
Distributed Systems
Middleware
Perl
Fortran
Job Scheduling
Python
Lua
Network Monitoring
Ansible
Sabre (Computer System)
Tcl (Programming Language)
Virtualization Technology
Data Storage Technologies
High Performance Computing
Parallel Computation
Gitlab
Computer Equipment
Kubernetes
Information Technology
Slurm
Docker
VMware
Programming Languages

Job description

Job title: HPC Engineer

Sabre is seeking an HPC Data Storage Engineer to support a mission-critical Department of Defense (DoD) program dedicated to high-performance computing operations. As an HPC Engineer, you will design, optimize, and maintain advanced high-performance computing environments that power large-scale data processing, simulation, and research operations. Your contributions will directly enable advanced data-intensive research efforts that are essential to national defense.

Duties include but are not limited to:

  • Utilize a wide variety of skills in system and network monitoring; large-scale systems administration; scripting and automation; security compliance; network distributed services; storage and backups; and hardware and software problem diagnosis and resolution.
  • Diagnose and troubleshoot technical problems, often of a complex nature, associated with computer hardware and software interrelationships and dependencies.
  • Conduct needs analysis, and plan and schedule the installation of a wide variety of new or modified hardware and software.
  • Develop functional and technical IT system requirements and specifications. Configure and optimize system tools and applications, including job schedulers (Slurm and PBS Pro) and system resources (GitLab, Lua/Tcl modules, and system support applications).
  • Create and deliver technical presentations to technical and non-technical stakeholders. Maintain detailed documentation of system configurations, procedures, and troubleshooting guides. Develop user-facing documentation.

Requirements

  • Bachelor's degree in Computer Engineering, Computer Science, or a related field, and ten or more years of job-related experience.
  • Thorough knowledge of complex concepts, practices, and troubleshooting associated with HPC cluster systems design, installation, and maintenance.
  • Advanced knowledge in distributed computing theory, parallel processing, applications, and associated infrastructure is required.
  • Extensive experience with Linux/Unix systems, including installation, configuration, networking, backups, updates and patching, data archiving, and system security. Functional knowledge of HPC middleware and platform managers such as Bright Cluster Manager, of job schedulers such as PBS, Slurm, and Torque, and of job queue optimization.
  • Experience with HPC or large-scale distributed computing environments and technologies such as high-speed, low-latency interconnects (e.g. InfiniBand), parallel file systems (e.g. Lustre), and virtualization environments and tools (e.g. VMware).
  • Experience developing Python/bash/Perl scripts and employing automation frameworks such as Ansible.
  • General knowledge of Docker containers and Kubernetes ecosystems.
  • Working knowledge of one or more programming languages (e.g. C/C++, Fortran).
  • Must be able and willing to travel to Northern Virginia approximately 25% of the time.

Clearance Requirements:

  • This position requires an active Top Secret DoD security clearance (U.S. citizenship required).

About the company

Sabre Systems, LLC, has been providing innovative technological solutions and services for Department of Defense, Federal Civilian, and commercial customers for more than 35 years. We support the ever-evolving areas of advanced communication technologies, cyber, systems and software engineering, and digital transformation. With over three decades in business, Sabre Systems, LLC remains committed to our small business values and a people-first philosophy. We foster a welcoming, inclusive culture that values diverse perspectives and encourages open communication. Our collaborative environment supports continuous learning and professional growth at all levels. We prioritize the health, well-being, and success of our employees, offering comprehensive, evolving benefits designed to meet their diverse needs. Join us and be part of a thriving, people-driven culture.
