Sr. Systems Engineer (Linux/HPC Environment) #11241
Job description
The Senior Systems Engineer will play a key role in supporting and maintaining a Linux-based high-performance computing (HPC) environment. This platform enables advanced analytics, statistical modeling, and research across multiple business functions. The engineer will be responsible for ensuring system reliability, performance, and security while partnering with cross-functional teams to deliver scalable technical solutions.
System Administration
· Administer and maintain Linux-based high-performance computing systems
· Perform system updates, patching, and security hardening
· Monitor system performance and optimize resources for high availability and efficiency
Platform Support
· Provide advanced (Tier 3) technical support for complex platform-related issues
· Troubleshoot and resolve system outages or performance degradation with minimal downtime
· Translate business and analytical requirements into technical solutions
Collaboration & Communication
· Partner with data engineers, data scientists, analysts, and other stakeholders
· Document system configurations, processes, and troubleshooting procedures
· Contribute to knowledge sharing and continuous improvement efforts
Security & Compliance
· Implement and maintain security best practices and protocols
· Conduct vulnerability assessments and system audits
· Ensure systems align with organizational and regulatory security standards
Project & Engineering Support
· Participate in system enhancements, upgrades, and performance optimization initiatives
· Contribute to system architecture and design decisions
· Support implementation of new tools, features, and capabilities
Requirements
· Strong expertise in Linux system administration and shell scripting
· Hands-on experience with Ansible and automation frameworks (e.g., Ansible Automation Platform)
· Experience supporting high-performance computing environments (e.g., SLURM, Open OnDemand)
· Familiarity with analytical and statistical tools such as Python, R, MATLAB, SAS, or similar
· Strong troubleshooting and problem-solving skills
· Excellent communication skills and ability to work across technical and business teams
- Are you currently in the DMV region, and can you work onsite?
- Do you have a strong understanding of Linux operating systems, shell scripting, and system administration tools?
- Are you knowledgeable about Ansible and Ansible Automation Platform?
- Do you have any experience with high-performance computing environments that use technologies such as SLURM and Open OnDemand, and with statistical analysis tools such as R, Python, MATLAB, Stata, or SAS?
Benefits & conditions
- 401(k)
- 401(k) matching
- Dental insurance
- Health insurance
- Paid time off
- Vision insurance