Software Engineer in the area of Machine Learning

ETH Zürich
Zürich, Switzerland
5 days ago

Role details

Contract type: Temporary contract
Employment type: Part-time / full-time
Working hours: Regular working hours
Languages: English

Tech stack

Agile Methodologies
Artificial Intelligence
Systems Engineering
C++
Software Quality
Nvidia CUDA
Computer Programming
Data Systems
Linux
DevOps
Distributed Systems
Data Flow Control
Python
Machine Learning
OpenMP
Package Management Systems
Performance Tuning
Scripting (Bash/Python/Go/Ruby)
Test Driven Development
PyTorch
Large Language Models
Kubernetes
Slurm

Job description

We are offering a contract initially limited to two years. The role gives you the opportunity to contribute to a fast-evolving AI landscape in which CSCS plays a key role and to support high-impact initiatives both nationally and internationally. This includes contributions to the Swiss AI Initiative and similar programs, such as supporting the development and release of the Apertus models.

  • Collaborate with researchers and users to understand and solve complex, real-world problems
  • Contribute to AI/ML projects, including large language model training, inference, fine-tuning, and HPC-accelerated workflows
  • Develop, maintain, and optimise software and systems, from core libraries programming to scripting and automation
  • Take ownership of high-impact tasks and see them through to completion while maintaining effective communication with stakeholders
  • Jump into ill-defined problems, explore solutions, and learn along the way

Requirements

Do you have experience in Python?
Do you have a Master's degree?

We are seeking a driven software engineer to work at the intersection of machine learning and high-performance computing, tackling complex, open-ended challenges to deliver scalable solutions. You will design and optimize a software-defined infrastructure that enables cutting-edge AI/ML projects in a high-performance and data-intensive environment.

We value technical excellence, curiosity, and the ability to learn and grow over an initial perfect match to a skills checklist. If you are motivated to make an impact in this space but do not meet all requirements, we still strongly encourage you to apply.

We welcome engineers with diverse backgrounds who are eager to contribute to our mission. We are primarily looking for strong technical foundations, sound engineering judgment, and the ability to bridge gaps across domains. Curiosity, adaptability, willingness to learn on the job, and potential for growth matter more to us than a perfect initial match to the technical requirements.

The technologies listed below illustrate the breadth of our stack and areas of interest. They are not a checklist of requirements; we expect experience in only some of these areas.

Technical environment and areas of interest:

  • Large-scale parallel and distributed systems, including performance tuning
  • Programming and tooling such as C/C++, Python, CUDA, OpenMP, and Spack
  • Linux-based systems, scripting, Slurm, and general systems engineering
  • Containerized and Kubernetes-based service deployment and operations
  • Large-scale machine learning and LLM workflows (e.g., PyTorch, Megatron, pre-training, fine-tuning, inference)
  • Storage and data systems (e.g., Lustre, NFS, VAST)
  • Collective communication and high-speed networking (e.g., NCCL, RCCL, MPI)
  • Monitoring and observability (e.g., DCGM, LDMS, metrics dataflow pipelines, and data product development)
  • Testing frameworks, software quality practices, and DevOps/GitOps approaches

Personal qualities:

  • Self-motivated, proactive, focused, and collaborative
  • Strong problem-solving mindset and comfort tackling complex or ambiguous problems
  • Clear communicator with a strong sense of user needs
  • Open to learning new technologies and working across disciplines
  • Comfortable asking for help and engaging the right expertise when needed

Ways of working:

  • Ability to thrive in collaborative, self-organizing environments based on Agile principles
  • Experience with structured development practices such as test-driven development is a plus

Our core values as guiding principles:

  • Curiosity: You enjoy learning, exploring new ideas, and understanding problems deeply
  • Openness: You listen, collaborate, and are receptive to different perspectives
  • Courage: You tackle challenging or ambiguous problems and are willing to take initiative
  • Supportive: You help colleagues and users succeed and contribute to a positive team culture
  • Integrity: You act honestly, ethically, and reliably in your work

Benefits & conditions

We are committed to building a diverse and inclusive engineering team and particularly encourage applications from groups underrepresented in tech. If you are technically adept, curious, and eager to grow, we want to hear from you.

  • Your job with impact: Become part of ETH Zurich, which not only supports your professional development, but also actively contributes to positive change in society
  • You can expect numerous benefits, such as public transport season tickets and car sharing, a wide range of sports offered by the ASVZ, childcare and attractive pension benefits
  • You can look forward to an exciting working environment, cultural diversity and attractive offers and benefits
  • We value the diversity of our team and, to further enhance the diversity of our workforce, we particularly encourage women to apply

About the company

The Swiss National Supercomputing Centre (CSCS) develops and operates a high-performance computing and data research infrastructure that supports world-class science in Switzerland. Its user laboratory is available to domestic and international researchers in academia, industry, and the business sector. The centre is operated by ETH Zurich and has offices at its data centre in Lugano and in Zurich.
