HPC Systems Engineer
Job description
HPC server systems are increasingly an essential and enabling component in the performance of advanced equipment supporting the next generation of semiconductor manufacturing. Process control equipment relies on HPC server systems to handle large amounts of data effectively and to apply state-of-the-art algorithms and software to derive useful and timely inspection and metrology results. As such, these systems, used in high-volume wafer and mask production, need to operate at near entitlement and be cost-effective, highly reliable, and serviceable for 15+ years in the field.
In this role, you will be responsible for the architectural design, deployment, and support of an HPC cluster product used in IC fabs and mask shops around the world. You will identify and assess developer requirements, devise solutions, and recommend, plan, and drive these solutions to production.
Requirements
- In-depth knowledge of and experience with Linux systems (SUSE, Red Hat, Rocky, Ubuntu)
- Experience with architecting, crafting, and maintaining robust storage
- Strong knowledge of and extensive experience working with virtualization technology
- Strong HPC HW knowledge, especially in the Server, GPU, Networking, Storage, Scheduler, BIOS & BMC arenas
- Experience with systemd, network boot/PXE, and Linux HA
- Strong understanding of TCP/IP fundamentals and knowledge of protocols such as DNS, DHCP, HTTP, LDAP, and SMTP
- Strong experience with storage file shares: NFS/CIFS
- Ability to code and develop Shell and Python scripts.
- Experience with one or more configuration management utilities (Ansible, Salt, Chef, Puppet, etc.)
- Strong DevOps focus: knowledge of setting up continuous development pipelines and Git-based repository software
- Hypervisor Knowledge: VMware, Proxmox, or XCP-ng
- Knowledge of Apache/Nginx, setting up proxy/reverse proxy, application server routing, and load balancing (HAProxy)
- HPC Schedulers: SGE/SLURM
- Monitoring tools: Prometheus, Grafana, Nagios
- Database Technologies: MySQL
- Doctorate (Academic) Degree and related work experience of 3 years; Master's Level Degree and related work experience of 6 years; or Bachelor's Level Degree and related work experience of 8 years, in Computer Engineering, Electrical Engineering, or a related field
Benefits & conditions
Base Pay Range: $154,900.00 - $263,300.00 Annually
Primary Location: USA-CA-Milpitas-KLA
KLA's total rewards package for employees may also include participation in performance incentive programs and eligibility for additional benefits including but not limited to: medical, dental, vision, life, and other voluntary benefits, 401(k) including company matching, employee stock purchase program (ESPP), student debt assistance, tuition reimbursement program, development and career growth opportunities and programs, financial planning benefits, wellness benefits including an employee assistance program (EAP), paid time off and paid company holidays, and family care and bonding leave.
Interns are eligible for some of the benefits listed. Our pay ranges are determined by role, level, and location. The range displayed reflects the pay for this position in the primary location identified in this posting. Actual pay depends on several factors, including state minimum wage rates, location, job-related skills, experience, and relevant education level or training. We are committed to complying with all applicable federal and state minimum wage requirements. If applicable, your recruiter can share more about the specific pay range for your preferred location during the hiring process.