Senior Systems Engineer (Linux / HPC Environment)

Stellent IT LLC
Washington, United States of America
3 days ago

Role details

Contract type
Temporary to permanent
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior

Job location

Washington, United States of America

Tech stack

Systems Engineering
Bash
Configuration Management
Linux
Document Management Systems
R
Python
Linux System Administration
Matlab
Performance Tuning
Ansible
SAS (Software)
Scripting (Bash/Python/Go/Ruby)
High Performance Computing
System Availability
Reliability of Systems
Information Technology
Patch Management
Slurm
3-tier Architectures
Vulnerability Analysis

Job description

As a Senior Systems Engineer, you will play a critical role in supporting, administering, and maintaining a Linux-based high-performance computing (HPC) environment that underpins advanced analytics, statistical modeling, and research activities. Your primary goal will be to ensure system reliability, security, and top-tier performance while working with cross-functional teams to deliver scalable technical solutions for evolving business needs. Key areas of responsibility include:

  • System Administration:
      • Administer and maintain Linux-based HPC systems.
      • Perform regular system updates, patch management, and robust security hardening.
      • Monitor, tune, and optimize system performance to ensure high availability and efficiency.

  • Platform Support:
      • Provide advanced (Tier 3) technical support for complex HPC platform issues.
      • Troubleshoot and resolve system outages or performance issues with minimal downtime.
      • Translate business and analytical requirements into workable technical solutions.

  • Collaboration & Communication:
      • Partner closely with data engineers, data scientists, analysts, and various stakeholders to understand and address their technology needs.
      • Document system configurations, processes, troubleshooting steps, and incident resolutions.
      • Drive knowledge sharing and support continuous process improvement activities.

  • Security & Compliance:
      • Implement and maintain security best practices, protocols, and regular audits.
      • Conduct vulnerability assessments to mitigate risks and protect sensitive data.
      • Ensure all systems adhere to organizational and regulatory compliance standards.

  • Project & Engineering Support:
      • Engage in system enhancements, upgrades, and performance initiatives to keep pace with technology advances.
      • Support system architecture and design decisions for both new and existing platforms.
      • Assist with the implementation and integration of new tools, features, and capabilities.

  • On-Call Support:
      • Participate in an on-call rotation to support critical systems and ensure maximum uptime.

Requirements

  • Extensive hands-on experience in Linux system administration, including scripting with Bash or other shells.
  • Proficiency with automation frameworks such as Ansible (including Ansible Automation Platform) for configuration management and deployment.
  • Background supporting high-performance computing environments, such as systems utilizing SLURM workload manager or Open OnDemand portals.
  • Understanding of analytical and statistical software tools such as Python, R, MATLAB, SAS, or similar platforms.
  • Exceptional troubleshooting and root-cause analysis skills, with the ability to resolve complex technical issues under pressure.
  • Highly effective communication skills, enabling collaboration with both technical specialists and business teams.
  • Commitment to security best practices and experience with vulnerability assessments in enterprise environments.
  • Strong documentation skills with a focus on process consistency and incident management.
  • Bachelor's degree in Computer Science, Information Technology, Engineering, or a relevant technical field (or equivalent experience).
  • Prior experience in system engineering roles within computational, research, or analytics-driven organizations is strongly preferred.
  • U.S. citizenship is required due to ongoing project needs.
  • Ability to work onsite as required; onsite engagement is full-time unless otherwise specified.
  • Willingness to participate in on-call rotations to ensure system uptime and reliability.
