Ceph Software Engineer (IT-SD-GSS-2026-96-LD)
Job description
Work with us to evolve the CERN data storage systems used at the Large Hadron Collider (LHC) and at its international partners. CERN is the birthplace of the World Wide Web and one of the world's leading laboratories for particle physics. Join the Storage and Data Management Group in CERN's IT Department as a Software Engineer for a unique challenge as the next step in your career. CERN, take part! The CERN IT Storage and Data Management group operates the core services used by both LHC and non-LHC experiments for data collection, archival, reconstruction, analysis, and global distribution via the Worldwide LHC Computing Grid.
As a Ceph Software Engineer, you will contribute to the design, evolution, and operation of large-scale distributed critical storage services for the CERN cloud and computing infrastructures. You will work hands-on with Ceph-based block, object, and filesystem solutions (including CephFS and NFS integrations), ensuring high availability, performance, and resilience across multi-datacenter environments. Your role will involve troubleshooting complex system-wide issues, optimising storage architectures for demanding workloads such as HPC and AI/ML, and continuously improving automation, deployment, and observability, aligned with modern DevOps practices.
Leveraging strong Linux expertise and systems programming skills (e.g., C/C++, Go, or Rust), you will help evolve the distributed storage technology while maintaining robust, scalable, and secure systems.
Functions
- Co-lead management and operations of distributed disk storage, block, object and filesystem services based on mainstream open-source technologies (Ceph, NFS).
- Participate in the evolution of architecture and design of storage services for CERN cloud and compute infrastructure, as well as core business applications (physics data processing, ML/AI, HPC use-cases).
- Integrate, troubleshoot and maintain distributed disk storage systems at scale, across multiple availability zones and data centres.
- Contribute to documentation, development, optimisation and further automation of storage services.
- Contribute to change management, incident response, and user support.
- Liaise with key stakeholders inside and outside of the IT department.
Requirements
- Master's degree or equivalent relevant experience in Computer Science or a related field.
- Deep understanding of Linux and of the architecture of storage and filesystems (e.g. NFS, CephFS), including high availability and failure domains.
- Knowledge of POSIX permissions model, POSIX ACLs and inheritance, and authentication/authorisation concepts (CephX, Kerberos for NFS).
- Proficiency in at least one systems programming language, ideally C/C++, or another high-performance language (e.g. Rust, Golang).
- Good knowledge of scripting languages (e.g. Python, shell) to automate deployment and testing is also required.
- Solid debugging skills for troubleshooting of complex distributed environments and performance tuning.
- Experience in diagnosing complex, system-wide issues which span hardware, network, and software layers.
- DevOps skills (CI/CD, GitLab, containerisation), monitoring, and system observability (Prometheus, Grafana, or similar).
- Strong collaboration and communication skills to work effectively with multiple cross-functional teams, including infrastructure, application, and end-user communities.
Nice-to-have Skills
- Familiarity with NFS-Ganesha.
- Familiarity with rsync/rclone, snapshots, snapshot-based and incremental replication, and filesystem-native migration tooling.
- Familiarity with inotify for change tracking, bind mounts, and Linux VFS semantics.
Technical competencies:
- Design of storage systems.
- Development of application software.
- Knowledge of programming techniques and languages.
- Knowledge of storage technologies.
- Operation and maintenance (preventive and corrective) of storage systems.
Behavioural competencies:
- Achieving Results: delivering prompt and efficient service taking into account customer needs.
- Demonstrating Flexibility: readily absorbing new techniques and working practices; proposing new or improved ways of working.
- Solving Problems: seeking and integrating other points of view when tackling an issue; consulting experts in the field and undertaking benchmarking.
- Working in Teams: building and maintaining constructive and effective work relationships.
- Learning and Sharing Knowledge: seeking feedback from colleagues and other stakeholders about ways of increasing competence.
Language skills:
Spoken and written English, with a commitment to learn French.
Benefits & conditions
Contract type: Limited duration contract (5 years). Subject to certain conditions, holders of limited-duration contracts may apply for an indefinite position.
Working Hours: 40 hours per week
Job Flexibility: Hybrid
This position involves:
- Work during nights, Sundays and official holidays, when required by the needs of the Organization.
- Stand-by duty, when required by the needs of the Organization.
Benefits:
- A competitive salary (tax free), increasing in line with your years of relevant experience.
- 30 days of paid leave per year plus 2 weeks of annual closure.
- Coverage by CERN's comprehensive health insurance scheme (for yourself, your spouse and children), and membership of the CERN Pension Fund.
- Family, child and infant monthly allowances depending on your individual circumstances.
- A relocation package (installation grant, removal, travel expenses) depending on your individual circumstances.
- Possibility to extend your contract up to 8 years, and eligibility for indefinite contract tenure.
About the company
Imagine taking part in the largest scientific experiment in the world. CERN needs more than physicists and engineers - if you're a student, a graduate, just starting your career or an experienced professional, whatever your field of expertise, CERN could be your next opportunity.