Senior Systems Administrator, High Performance Computing
Job description
The Center for Advanced Computing (CAC) at Cornell University seeks a senior systems administrator to design, deploy, and manage high performance computing (HPC) clusters and servers. Well-qualified candidates will possess multiple years of experience providing support in an HPC environment, will be proficient in troubleshooting hardware issues, and will be able to advance projects with minimal direction.
As Systems Administrator IV, you will build technology that allows Cornell researchers to conduct cutting-edge research and advance the frontiers of science. In addition to supporting systems used by New York City-based units, Weill Cornell Medicine, and Weill Cornell Medicine - Qatar, you will collaborate with other team members working with researchers across the spectrum of academic disciplines at Cornell.
As part of the Cornell community, you will help foster a culture of belonging and a healthy work environment by communicating across differences; being cooperative, collaborative, open, and welcoming; showing respect, compassion, and empathy; engaging and supporting others regardless of background or perspective; speaking up when others are being excluded or treated inappropriately; and supporting work/life integration of oneself and others.
- Leading efforts to implement emerging cloud and storage technologies
- Designing and making recommendations for hardware and software solutions for new clusters and servers for Cornell customers
- Integrating new systems into the existing heterogeneous computing environment
- Maintaining a secure and stable computing infrastructure
- Testing and evaluating new tools
- Developing training and documentation for use of systems managed by the Center for Advanced Computing
Requirements
- Bachelor's degree in computer science or a related technical field with at least five years of experience managing Linux servers and HPC clusters, or an equivalent combination of education and relevant experience
- Working knowledge of cloud services and virtualization platforms
- A background in evaluating future technologies and providing recommendations for upgrades and installations
- Proven experience using compilers to build and debug scientific software installations
- Demonstrated skills in problem solving, critical thinking, and written and oral communication
- Demonstrated ability to interact with others on a collaborative team towards a shared project goal
- Experience in the administration of Linux core services such as SSH, LDAP, Samba, NFS, and sudo
- Demonstrated scripting proficiency on various Linux platforms (Python, shell, Java)
- Demonstrated understanding and ability to address interoperability issues
- Ability to cultivate and develop inclusive and equitable working relationships with students, faculty, staff and community members
Additionally, although not required, the following would represent pluses:
- Advanced degree
- Experience with provisioning tools (Warewulf), real-time monitoring tools (Open XDMoD), distributed and parallel file systems (CephFS, NFS, Lustre, BeeGFS), and virtualization platforms (OpenStack)
- Prior relevant work or industry experience
- Education level to the extent education is relevant to the position
- Unique applicable skills
- Academic discipline