HPC System Administrator
FH Campus Wien
Vienna, Austria
2 days ago
Role details
Contract type: Permanent contract
Employment type: Full-time (> 32 hours)
Working hours: Regular working hours
Languages: English
Compensation: € 56K
Job location: Vienna, Austria
Tech stack
Bash
Configuration Management
Debian Linux
File Systems
Python
Linux System Administration
Red Hat Enterprise Linux - RHEL
Ansible
Weka
Ceph
Scripting (Bash/Python/Go/Ruby)
Reliability of Systems
Information Technology
Slurm
Puppet
Job description
We are seeking a skilled and proactive High-Performance Computing (HPC) administrator to join our team, responsible for managing and optimizing our high-performance computing environment. If you thrive in a dynamic and collaborative environment where your expertise is crucial in supporting cutting-edge research and scientific advancements, this opportunity is for you.
- Cluster Management: Oversee and manage daily operations of the compute infrastructure, including configuration, deployment, and optimization of nodes and networks to maximize performance for our user base
- System Monitoring and Maintenance: Monitor system performance, storage, and network utilization to ensure the systems operate efficiently. Address hardware and software issues as they arise
- User Support: Provide technical assistance to researchers on efficient use of cluster resources
- Documentation and Reporting: Create and maintain comprehensive documentation on system configuration, maintenance tasks, and troubleshooting procedures. Generate regular reports on system performance, uptime, and resource usage for management
Requirements
- Education and Experience: Education in Computer Science, Information Technology, or related field (or equivalent experience)
- Technical Skills: Proficiency in HPC cluster management tools (e.g., Slurm, PBS, or Torque) and Linux system administration (Debian, RHEL)
- Scripting and Automation: Strong scripting skills in Python, Bash, or other languages. Experience with configuration management tools (e.g., Ansible, Puppet, Chef) to automate tasks, optimize processes, and improve system reliability
- Networking and Storage: Solid understanding of high-speed networking, parallel file systems, and large-scale storage solutions (e.g., Lustre, Ceph, Weka, BeeGFS)
- Problem-Solving: Excellent troubleshooting abilities and a proactive approach to resolving system issues before they impact users
- Independence: Independent, results-driven working style with demonstrated ownership and accountability
About the company
Institute of Science and Technology Austria (ISTA) is a constantly growing international institute for conducting world-class research in mathematics, computer science, life sciences, and physical sciences. We recruit passionate professionals from around the world and across all fields who share our goal of excellent research. Located on a beautiful campus on the outskirts of Vienna, we offer multiple opportunities for personal growth in a stable working environment.