System Software Engineer - Slurm
Job description
We are seeking an expert System Software Engineer with a strong background in Linux systems programming, systems administration, and technical support. This hybrid role combines software development, system-level troubleshooting, and direct support for internal or customer-facing environments. You will design and maintain Slurm components written in C, diagnose complex system issues, and collaborate with other engineers to ensure Slurm runs efficiently. The ideal candidate is equally comfortable writing efficient, reliable, and maintainable code, analyzing system performance, and supporting production environments.
What you'll be doing:
- Maintain, improve, and optimize software components in C
- Develop and maintain system-level and application-level code for Slurm, including networking, system-level, and device-level components
- Debug and troubleshoot complex Slurm issues related to reliability and performance
- Write clean, maintainable, and well-documented code that adheres to industry standards
- Collaborate with cross-functional teams including Operations, Infrastructure, and Deployment
- Provide direct technical support to internal teams or external customers
- Develop automated tests to ensure software reliability and regression prevention
- Stay current with best practices in C programming, compilers, build systems, and related technologies
Requirements
- Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent experience)
- 5+ years of professional experience in C development
- Strong understanding of memory management, pointers, data structures, and algorithms
- Experience with debugging tools such as GDB and with performance profiling
- Solid understanding of Linux kernel interfaces, system calls, and file systems, plus experience with Automake
- Understanding of software development lifecycles and agile methodologies
- Strong problem-solving and analytical skills
- Experience working in an environment with a focus on quality and reliability
- Experience with containers and GPU technologies
- Curious, self-motivated, and eager to learn new technologies
Ways to stand out from the crowd:
- Experience with C and other low-level languages
- Background in system administration or High Performance Computing
- Experience with Slurm Workload Manager or other HPC scheduling systems
- Knowledge of operating system internals or hardware-software interaction
- Contributions to open-source C projects